Chimeras of Hepatitis C Virus and Bovine Viral Diarrhea Virus

Reference to Government Grant 

This invention was made with government support under a grant from the National 
Institutes of Health, grant numbers PHS CA57973 and AI40034. The government has certain 
rights in this invention. 


Related Applications 

This application claims priority to, and incorporates herein in its entirety, U.S. 
60/082,964 filed April 24, 1998. 

Background of the Invention
(1) Field of the Invention

This invention relates generally to the development of therapies for treating hepatitis 
C virus (HCV) and bovine viral diarrhea virus (BVDV) and more particularly to the 
identification of such therapies using chimeric viruses comprising a genomic sequence
derived from HCV and bovine viral diarrhea virus (BVDV).
(2) Description of the Related Art 

The Flaviviridae is an important family of human and animal RNA viral pathogens
(Rice, CM. 1996. Flaviviridae: The viruses and their replication. In: Fields BN, Knipe DM,
Howley PM., eds. Fields virology. Philadelphia: Lippincott-Raven Publishers, pp. 931-960).
The three currently recognized genera of the Flaviviridae family exhibit distinct differences in
transmission, host range, and pathogenesis. For example, members of the classical flavivirus
genus, such as yellow fever virus and dengue virus, are typically transmitted to vertebrate
hosts via arthropod vectors and cause acute self-limiting disease (Monath TP, Heinz FX.
1996. Flaviviruses. In: Fields BN, Knipe DM, Howley PM., eds. Fields virology. New York:
Raven Press, pp. 961-1034). The pestiviruses, such as bovine viral diarrhea virus (BVDV)
and classical swine fever virus (CSFV), cause economically important livestock disease and
are spread by direct contact or the fecal-oral route (Thiel et al., 1996. Pestiviruses. In: Fields
BN, Knipe DM, Howley PM., eds. Fields virology. New York: Raven Press, pp. 1059-1073).
The most recently characterized Flaviviridae genus is the hepacivirus genus, the sole member
of which is the common and exclusively human pathogen, hepatitis C virus (HCV). HCV is
transmitted by contaminated blood or blood products and is the most common agent of
non-A, non-B hepatitis, affecting more than 1% of the population worldwide (Houghton, 1996.
Hepatitis C viruses. In: Fields BN, Knipe DM, Howley PM., eds. Fields virology.
Philadelphia: Lippincott-Raven Publishers, pp. 1035-1058). Unlike flavivirus and pestivirus
infections, which are usually eliminated by the host immune response, chronic HCV infections
are common and can cause mild to severe liver disease including cancer.

Despite these differences, members of the Flaviviridae family share common
structural features and gene expression strategies. Virus particles consist of a lipid bilayer
envelope with embedded transmembrane glycoproteins surrounding a protein-RNA
nucleocapsid. Genome RNAs are single-stranded, of positive polarity, and function as the sole
mRNA species for translation of a single long open reading frame (ORF). This ORF is
translated into a polyprotein which is processed by cellular and viral proteases into mature
viral proteins. Structural proteins destined for incorporation into virus particles are encoded
in the N-terminal portion of the polyprotein, while the nonstructural proteins which form
components of the viral RNA replicase are encoded in the remainder.

Replication of the Flaviviridae RNA genome occurs via synthesis of a full-length
negative-strand intermediate and is asymmetric, favoring synthesis of positive-strand RNAs.
However, little is known about the details of this process. For all three genera of the
Flaviviridae family, full-length functional cDNA clones have been constructed and RNAs
transcribed from these cDNA templates are infectious. For flaviviruses and pestiviruses,
mutagenesis of these clones and efficient RNA transfection of permissive cell cultures
provide a means of probing the role of cis RNA elements and viral proteins in replicase
assembly and function. Such analyses are not yet possible for HCV since this virus is unable
to replicate efficiently in cell culture.

As with many other RNA viruses, the 5' and 3' terminal sequences of the
Flaviviridae are believed to contain conserved cis-elements important for translation, RNA
replication, and packaging (Bukh et al., Proc. Natl. Acad. Sci. USA 89:4942-4946, 1992; Deng
et al., Nucleic Acids Res. 27:1949-1957, 1993; Cahour et al., Virol. 207:68-76, 1995;
Kolykhalov et al., J. Virol. 70:3363-3371, 1996; Men et al., J. Virol. 70:3930-3937, 1996;
Tanaka et al., J. Virol. 70:3307-3312, 1996; Huang HV. 1997. Evolution of the alphavirus
promoter and the cis-acting sequences of RNA viruses. In: Saluzzo J-F, Dodet B. eds. Factors
in the emergence of arbovirus diseases. Paris: Elsevier Press, pp. 65-79; Mandl et al., J. Virol.
72:2132-2140, 1998). The 5' nontranslated region (NTR) functions initially at the level of
translation. Similar to most cellular mRNAs, flavivirus genome RNAs are translated in a
cap-dependent manner. These RNAs contain a 5' cap structure that is presumably added by
virus-encoded RNA triphosphatases, guanylyl-, and methyl-transferases (Rice, 1996, supra).
In contrast, the translational strategy employed by pestiviruses and HCV is more similar to
that of the picornaviruses. These RNAs appear to be uncapped and contain long 5' NTRs with
cis RNA elements that function as internal ribosome entry sites (IRES) for translation
initiation at the polyprotein AUG (Lemon et al., Semin. Virol. 5:274-288, 1997).

The 5' NTRs of HCV and BVDV have a similar structural and functional organization
despite containing only short stretches of high sequence identity (Wang et al., Curr. Top.
Microbiol. Immunol. 203:99-115, 1995; Lemon et al., 1997, supra). The IRES within each
NTR is located at the 3' end of the NTR at a position proximal to the AUG initiation codon of
the ORF. Although the 5' terminal sequence of each of these viruses is apparently not
required for IRES function (Rijnbrand et al., FEBS Lett. 365:115-119, 1995; Honda et al.,
Virology 222:31-42, 1996; Rijnbrand et al., J. Virol. 71:451-457, 1997), these sequences are
highly conserved among different strains of HCV (Bukh et al., Proc. Natl. Acad. Sci.
USA 89:4942-4946, 1992) or BVDV (Deng et al., 1993, supra), suggesting they play other
roles in viral replication. For example, sequences in the 5' NTR may be required for
regulating translation versus initiation of negative-strand RNA synthesis. Such regulation
could occur by direct interaction of 5' and 3' RNA elements or indirectly, via RNA-protein
interactions. Sequences in the 5' NTR may also modulate packaging versus translation.
Finally, sequences complementary to the 5' NTR, which are located at the 3' end of
negative-strand RNA, are likely to function in the initiation of positive-strand RNA synthesis.

The HCV 3' NTR contains an internal polypyrimidine tract followed by a highly
conserved sequence of 98 bases at the 3' terminus, which has been shown to be required for
replication of HCV (U.S. Application Serial No. 08/811,566).

Further elucidation of the role of sequences in the HCV 5' and 3' NTRs has been
hampered by the inefficient replication of HCV in cell culture. This aspect of HCV biology
also makes it difficult to identify and test possible antiviral compounds for activity against
HCV. Thus, a need exists for a system which facilitates investigation of HCV replication and
therapeutic approaches to control HCV infections.



Summary of the Invention

Briefly, therefore, the present invention provides novel compositions and methods for
studying HCV replication which are based on the discovery that chimeras of HCV and BVDV
genomic sequences can be constructed that are able to replicate in cell culture. The
BVDV-specific sequence provides the chimeric viral nucleic acid with the ability to replicate
in cell culture, while the HCV-specific sequence allows the chimeric viral nucleic acid to be
used to screen possible compounds for anti-viral activity against HCV. It is believed that
similar replication-competent chimeras can be constructed from HCV and other pestiviruses.

Thus, in one embodiment, the present invention provides a novel, chimeric viral RNA
in which at least one of the 5' NTR, ORF and 3' NTR regions is chimeric and comprises a
nucleotide sequence from the corresponding region of a pestivirus in operable linkage with a
nucleotide sequence from the corresponding region of a hepatitis C virus (HCV). The
chimeric viral RNA is replication-competent. In preferred embodiments, the pestivirus is
BVDV.

In other embodiments, the invention provides a polynucleotide comprising a
DNA-dependent promoter operably linked to a cDNA of a chimeric viral RNA as described above
and cells transiently transfected or stably transformed with the polynucleotide. In some
embodiments the cDNA may encode a dominant selectable marker or an assayable reporter.

In yet another embodiment the invention provides a method for identifying
compounds having anti-HCV activity. The method comprises providing a first cell containing
a chimeric viral nucleic acid derived from HCV and a pestivirus as described above and a
second cell containing the pestivirus, and then comparing the replication efficiency of the
chimeric viral nucleic acid in the presence and absence of a test compound to the replication
efficiency of the pestivirus in the presence and absence of the test compound,
wherein a greater compound-induced reduction in replication efficiency for the chimeric viral
nucleic acid than for the pestivirus indicates the compound has anti-HCV activity.
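
The comparison at the heart of this screening method can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of a simple fold-reduction ratio, and the optional `margin` parameter are assumptions made here for clarity and are not part of the disclosure, which does not prescribe a particular readout or statistic.

```python
def fold_reduction(untreated: float, treated: float) -> float:
    """Fold drop in a replication readout (e.g. a titer) caused by the compound."""
    if untreated <= 0 or treated <= 0:
        raise ValueError("replication measurements must be positive")
    return untreated / treated


def indicates_anti_hcv_activity(
    chimera_untreated: float,
    chimera_treated: float,
    pestivirus_untreated: float,
    pestivirus_treated: float,
    margin: float = 1.0,  # hypothetical safety factor, not from the disclosure
) -> bool:
    """True when the compound reduces chimera replication more than it
    reduces replication of the parental pestivirus."""
    chimera_drop = fold_reduction(chimera_untreated, chimera_treated)
    pestivirus_drop = fold_reduction(pestivirus_untreated, pestivirus_treated)
    return chimera_drop > margin * pestivirus_drop
```

A compound that lowers both readouts equally would be scored as non-selective, since its effect cannot be attributed to the HCV-specific sequence.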

The invention also provides a genetically-engineered virus which comprises a
chimeric viral nucleic acid derived from HCV and a pestivirus as described above. In one
embodiment the genetically-engineered virus comprises virus particles containing at least one
HCV structural protein and is useful in a vaccine against HCV. In another embodiment, the
genetically-engineered virus is attenuated as compared to the pestivirus and is useful as a
vaccine against the pestivirus.

In a still further embodiment, the invention provides a replication-competent BVDV
vector expressing a heterologous sequence. The BVDV vector comprises the BVDV
sequences encoding the BVDV replication machinery. In some embodiments, the
replication-competent BVDV vector expresses an antigen and is useful as a vaccine.



Brief Description of the Drawings 

Figure 1 is a schematic representation of the 5' NTRs of BVDV, HCV, and EMCV
showing the position of the start codons of the ORF, and the boxes indicating the canonical
IRES elements.

Figure 2 shows a schematic representation of BVDV and HCV chimeras, plaque
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and
an indication of whether pseudorevertants arose, with results from BVDV, 5'HCV,
BVDV+HCV, and BVDV+HCVdelB3 chimeras shown in Fig. 2A and results from
BVDV+HCVdelB2B3, BVDV+HCVdelB1B2B3, BVDV+HCVdelB2B3H1, and
BVDV+HCVdelB2B3H1H2 shown in Fig. 2B, where N.D. means not determined.

Figure 3 illustrates the in vitro translation efficiency of BVDV RNA or chimeras
showing bar graphs of the amount of Npro, the N-terminal protein in the BVDV ORF,
expressed by the various constructs.

Figure 4 illustrates a schematic representation of EMCV chimeras, plaque 
phenotypes, reticulocyte translation efficiencies relative to parental BVDV, specific 
infectivities in MDBK cells, titers at 24 and 48 h post-transfection (or 72 h, as indicated), and 
an indication of whether pseudorevertants arose. 

Figure 5 illustrates pseudorevertant analyses showing in (Fig. 5A) the relative
positions of mutations detected within the plaque-purified variants of passaged
BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV, and in (Fig. 5B) the 5' terminal sequences of
pseudorevertants of BVDV+HCVdelB1B2B3, 5'EMCV, and 5'HCV. Novel nucleotides or
sequences are shown in bold upper case type. Pseudorevertants are numbered and designated
by the suffix ".R". The upper case sequence in BVDV+HCVdelB1B2B3 and
BVDV+HCVdelB1B2B3.R1 is a remnant of downstream BVDV 5' NTR sequences and was
created during the cloning procedures.

Figure 6 illustrates the construction of derivatives of 5'HCV designed to contain 5'
termini corresponding to the sequence detected within the three analyzed pseudorevertants.
Fig. 6A shows the 5' terminal sequence of the 5'HCV derivatives with the suffix (orig)
designating a derivative containing the original 5' terminal sequence of the pseudorevertant;
the suffix (cons) designating a derivative containing the consensus tetranucleotide sequence
5'-GUAU at the same position; and novel sequences shown in bold upper case type. Fig. 6B
shows plaque phenotypes, reticulocyte translation efficiencies relative to parental BVDV,
specific infectivities in MDBK cells, and titers at 24 and 48 h post-transfection.

Figure 7 illustrates a single step growth curve for various chimeric constructs
showing released virus titers measured by performing plaque assays on MDBK cells
transfected with various constructs.

Figure 8 illustrates replication of BVDV RNA or chimeric derivatives in transfected
MDBK cells. Equal numbers of MDBK cells (~8 x 10^6) were electroporated with 5 µg of
each in vitro synthesized RNA. MDBK cells were also transfected with infectious yellow
fever 17D and Sindbis RNAs to provide molecular mass markers. One fifth of the transfected
cells were seeded on 35-mm dishes and incubated in D-MEM supplemented with 10% horse
serum for 6 h at 37°C. The media were then replaced with 1 ml of fresh media containing
2 µg/ml of actinomycin D and 40 µCi/ml of 3H-uridine. Incubations were continued for 10 h at
37°C. RNAs were isolated as described in Materials and Methods, and one quarter of each
sample was denatured in glyoxal and loaded on an agarose gel. (A) Autoradiograph of the
dried gel. Only the portion of the gel containing the genomic RNAs is shown. (B) Amount of
radioactivity contained within the displayed fragments as determined by scintillation
counting. BVDV, lane 1; 5'HCV, lane 2; BVDV+HCVdelB2B3, lane 3;
BVDV+HCVdelB2B3H1, lane 4; 5'HCV.R1orig, lane 5; 5'HCV.R1cons, lane 6;
5'HCV.R2orig, lane 7; 5'HCV.R2cons, lane 8; 5'HCV.R3orig, lane 9; 5'HCV.R3cons, lane 10;
yellow fever 17D, lane 11; Sindbis, lane 12; non-transfected MDBK cells, lane 13. The
experiment shown is one of two repetitions that yielded similar results.

Figure 9 illustrates the genetic map of plasmid pACNR/BUD.

Figure 10 illustrates the sequence of low copy number plasmid pACNR/BVDV
NADL (circular) harboring the functional cDNA of cytopathic BVDV NADL (positive sense
cDNA 5' to 3'; nt 1-12578).

Figure 11 illustrates the sequence of infectious BVDV NADL (positive sense cDNA
5' to 3').

Figure 12 illustrates the sequence of infectious non-cytopathic BVDV NADL lacking
cIns (positive sense cDNA 5' to 3').

Figure 13 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 14 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R1.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 15 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 16 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R2.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 17 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.cons
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 18 illustrates the sequence of adapted HCV 5' NTR from 5'HCV/R3.orig
(positive sense cDNA 5' to 3'; only the sequence from the 5' base to the ATG initiating the
polyprotein is shown).

Figure 19 illustrates the sequence of prototype HCV-BVDV chimera from
pNADL/5'HR3.orig/3H3'B with the adapted HCV 5' NTR from 5'HCV/R3.orig and tandem 3'
NTR elements from HCV followed by BVDV (positive sense cDNA 5' to 3') as discussed in
Examples.

Figure 20 illustrates various deletions of the poly(U) tract in the 3' NTR HCV
sequence of BVDV/HCV chimera p5H-3H33.

Figure 21 illustrates the schematic representation of functional HCV-BVDV chimera
from pCBV/p7.

Figure 22 illustrates the sequence of functional HCV-BVDV chimera from pCBV/p7
(positive sense cDNA 5' to 3').

Figure 23 illustrates the schematic representation of a HCV/BVDV chimera with 
selectable marker. 

Figure 24 illustrates the sequence of functional HCV-BVDV chimera from
pCBV/p7/IRES-pac expressing a dominant selectable marker conferring resistance to
puromycin (positive sense cDNA 5' to 3').

Figure 25 illustrates the schematic representation of a bicistronic HCV/BVDV
chimera.

Figure 26 illustrates the sequence of functional bicistronic chimera expressing the
entire HCV structural region derived from plasmid pNADL/BI#41/HCV str (positive sense
cDNA 5' to 3').

Description of the Preferred Embodiments 

In accordance with the present invention, the inventors herein have succeeded in
generating HCV-BVDV chimeric RNAs which are replication competent. Such chimeras are
useful in screening compounds in vitro for antiviral activity against HCV. In addition, it is
believed that in vivo replication of HCV-BVDV chimeras according to the invention may be
attenuated as compared to wild-type BVDV and thus may be useful in vaccinating animals
against BVDV. It is also believed that the HCV chimeric structures described herein for
BVDV are applicable to other pestiviruses.

In the context of this disclosure, the following terms will be defined as follows unless
otherwise indicated:

"Cis-acting sequences" means the nucleotide sequences from an RNA virus genome 
that are necessary for recognition of the genomic RNA by specific protein(s) of the RNA 
virus or host cell that carry out replication, transcription, translation or packaging of the
genome.

"Genetically-engineered virus" means any virus whose genome is different than that 
of a wild-type virus due to a human-made deletion, insertion, or substitution of one or more 
nucleotides to the wild-type viral genome. 
"Infectious" when used to describe a virus means the virus is capable of entering cells
and initiating a virus replication cycle, whether or not this leads to the production of new
RNA virus particles.

"Nucleotide sequence" as used herein refers to DNA and the corresponding RNA
sequence where relevant. It will be understood that sequences shown in the Figures are DNA
versions of the RNA sequence and that chimeric molecules of the invention may comprise
RNA molecules or cDNA copies of such RNA molecules.

"Replication-competent" as applied to a chimeric HCV-pestivirus RNA means the
RNA is capable of RNA-dependent replication in at least one cell type that supports
replication of the wild-type parental pestivirus. The number of replicated RNA molecules
produced by an HCV-pestivirus chimeric RNA of the invention is at least 10-fold higher than
the limit of detection, which is typically 10 to 100 molecules. More preferably, chimeric
RNA production by the HCV-pestivirus chimeric RNA is at least 10^2- to 10^3-fold higher than
the detection limit. The replication-competent chimeric RNA replicates at an efficiency that
is preferably at least 0.001%, more preferably at least 0.01%, more preferably at least 0.1%,
more preferably at least 1%, more preferably at least 10% and most preferably at least 50%
up to 90% that of the parental pestivirus in the same cell type.
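
The numerical criteria in this definition can be expressed as a short Python sketch. The function names, the default detection limit of 100 molecules (the upper end of the typical range stated above), and the idea of reporting the highest preference tier reached are illustrative assumptions, not part of the definition itself.

```python
from typing import Optional

# Preferred efficiency thresholds, in percent of the parental pestivirus.
PREFERENCE_TIERS = (0.001, 0.01, 0.1, 1.0, 10.0, 50.0)


def is_replication_competent(
    replicated_molecules: float,
    detection_limit: float = 100.0,     # assumed; stated range is 10 to 100
    min_fold_over_limit: float = 10.0,  # "at least 10-fold higher"
) -> bool:
    """True when RNA production exceeds the detection limit by the required fold."""
    return replicated_molecules >= min_fold_over_limit * detection_limit


def relative_efficiency_percent(chimera_yield: float, parental_yield: float) -> float:
    """Chimera replication as a percentage of the parental virus in the same cell type."""
    if parental_yield <= 0:
        raise ValueError("parental yield must be positive")
    return 100.0 * chimera_yield / parental_yield


def highest_tier_met(efficiency_percent: float) -> Optional[float]:
    """Highest preferred threshold reached, or None if below all of them."""
    met = [tier for tier in PREFERENCE_TIERS if efficiency_percent >= tier]
    return max(met) if met else None
```

For instance, a chimera yielding 5 units where the parent yields 1000 replicates at 0.5% efficiency, which satisfies the 0.1% tier but not the 1% tier.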

"Transfected cell" means a cell containing an exogenously introduced nucleic acid 
molecule, and includes cells that are transiently transfected with the exogenous nucleic acid. 

"Transformed cell" or "stably transformed cell" means a cell containing an 
exogenously introduced nucleic acid molecule which is present in the cytoplasm or nucleus of
the cell and may be stably integrated into the chromosomal DNA of the cell. 

"Virus" means a virion, virus particle or a viral genome. 

A chimeric viral RNA according to the invention is designed to comprise a 5' NTR,
an ORF, and a 3' NTR, at least one of which is a chimeric region containing two operably
linked nucleotide sequences that are from the same region of a pestivirus and an HCV.
Pestivirus-specific sequences useful in the invention can be taken from the appropriate
genomic region of any cytopathic or noncytopathic type I or type II BVDV isolate, classical
swine fever virus (CSFV) isolate, or border disease viral isolate. For a list of pestiviruses,
see Thiel, H.-J., P. G. W. Plagemann, and V. Moennig. 1996. Pestiviruses, p. 1059-1073. In
B. N. Fields, D. M. Knipe and P. M. Howley (ed.). Fields Virology. Raven Press, New York.
HCV-specific sequences can be taken from any strain or isolate of HCV, including but not
limited to HCV-1, HCV-1a, HCV-1b, HCV-1c, HCV-2a, HCV-2b, HCV-2c, HCV-3a.
Preferably, the parental pestivirus is a cytopathic strain of BVDV and the parental HCV strain
is HCV-1.

The pestivirus- and HCV-specific sequences are operably linked in the chimeric
region, meaning the sequences are arranged such that the resulting chimeric structure is
functional in the context of replication of the pestivirus. For example, in one preferred
embodiment the chimeric viral RNA comprises a chimeric 5' NTR which comprises a
BVDV-specific 5' terminal sequence of 5'-(G/A)UAU and an IRES derived from HCV, with
the ORF and the 3' NTR consisting of a sequence from the same regions of BVDV. The
BVDV-specific sequences at the 5' terminus and in the ORF and 3' NTR are chosen such that
they are functional in the context of BVDV, meaning the chimeric viral RNA expresses the
replication machinery of BVDV and this replication machinery is capable of replicating the
chimeric RNA. In addition, translation of the BVDV ORF in the chimeric viral RNA is
dependent upon a functional HCV IRES. The presence of a functional HCV IRES in this
chimera allows the chimera to be used to screen for compounds that target the HCV IRES and
thereby inhibit translation of the BVDV ORF as well as replication of the chimeric virus.
Such compounds would be expected to also inhibit translation of the ORF in a wild-type HCV
and consequently inhibit HCV replication.
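
As a small illustration of the 5' terminal requirement described above, the following Python sketch checks whether a candidate sequence begins with the tetranucleotide 5'-(G/A)UAU. The helper names are hypothetical, and accepting cDNA input by converting T to U is an assumption made here because the Figures present DNA versions of the RNA sequences. The check says nothing about whether the rest of the construct is functional.

```python
import re

# 5'-(G/A)UAU at the very start of the RNA: G or A, then U, A, U.
FIVE_PRIME_MOTIF = re.compile(r"^[GA]UAU")


def to_rna(sequence: str) -> str:
    """Normalize a cDNA or RNA string to upper-case RNA (T becomes U)."""
    return sequence.upper().replace("T", "U")


def has_bvdv_like_5prime_terminus(sequence: str) -> bool:
    """True when the sequence starts with 5'-(G/A)UAU."""
    return FIVE_PRIME_MOTIF.match(to_rna(sequence)) is not None
```

Short toy strings such as `"GUAUACG"` or the cDNA form `"gtatacg"` would pass this check, while a string starting with `"CUAU"` would not.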

Compounds that could be screened for anti-HCV activity using this and other
HCV-BVDV 5' NTR chimeras include but are not limited to antisense RNAs, RNA decoys that
bind proteins involved in recognition of the HCV-specific sequences, ribozymes, and small
molecule inhibitors of critical RNA-protein interactions. The use of such substances for
therapeutic applications is known in the art. See, e.g., Amarzguioui M, et al., "Hammerhead
ribozyme design and application." Cell Mol Life Sci. 1998 Nov;54(11):1175-202; Welch PJ,
et al., "Expression of ribozymes in gene transfer systems to modulate target RNA levels."
Curr Opin Biotechnol. 1998 Oct;9(5):486-96; Bramlage B, et al., "Designing ribozymes for
the inhibition of gene expression." Trends Biotechnol. 1998 Oct;16(10):434-8; Gewirtz AM,
et al., "Nucleic acid therapeutics: state of the art and future prospects." Blood. 1998 Aug
1;92(3):712-36; Altman S., "RNase P in research and therapy." Biotechnology (N Y). 1995
Apr;13(4):327-9; Flanagan WM., "Antisense comes of age." Cancer Metastasis Rev. 1998
Jun;17(2):169-76; Agrawal S, et al., "Antisense therapeutics." Curr Opin Chem Biol. 1998
Aug;2(4):519-28; Caselmann WH, et al., "Synthetic antisense oligodeoxynucleotides as
potential drugs against hepatitis C." Intervirology 1997;40(5-6):394-9; Neckers LM.,
"Oligodeoxynucleotide inhibitors of function: mRNA and protein interactions." Cancer J Sci
Am. 1998 May;4 Suppl 1:S35-42; Agrawal S, et al., "Mixed backbone oligonucleotides:
improvement in oligonucleotide-induced toxicity in vivo." Antisense Nucleic Acid Drug Dev.
1998 Apr;8(2):135-9; Crooke ST., "An overview of progress in antisense therapeutics."
Antisense Nucleic Acid Drug Dev. 1998 Apr;8(2):115-22; Fraisier C, et al., "High level
inhibition of HIV replication with combination RNA decoys expressed from an HIV-Tat
inducible vector." Gene Ther. 1998 Dec;5(12):1665-76; Gervaix A, et al., "Gene therapy
targeting peripheral blood CD34+ hematopoietic stem cells of HIV-infected individuals."
Hum Gene Ther. 1997 Dec 10;8(18):2229-38; Nakaya T, et al., "Inhibition of HIV-1
replication by targeting the Rev protein." Leukemia 1997 Apr;11 Suppl 3:134-7; Nakaya T, et
al., "Decoy approach using RNA-DNA chimera oligonucleotides to inhibit the regulatory
function of human immunodeficiency virus type 1 Rev protein." Antimicrob Agents
Chemother. 1997 Feb;41(2):319-25; Smith C, et al., "Transient protection of human T-cells
from human immunodeficiency virus type 1 infection by transduction with adeno-associated
viral vectors which express RNA decoys." Antiviral Res. 1996 Oct;32(2):99-115; Bahner I, et
al., "Transduction of human CD34+ hematopoietic progenitor cells by a retroviral vector
expressing an RRE decoy inhibits human immunodeficiency virus type 1 replication in
myelomonocytic cells produced in long-term culture." J Virol. 1996 Jul;70(7):4352-60; Lee
SW, et al., "Inhibition of human immunodeficiency virus type 1 in human T cells by a potent
Rev response element decoy consisting of the 13-nucleotide minimal Rev-binding domain." J
Virol. 1994 Dec;68(12):8254-64; Lisziewicz J, et al., "Inhibition of human immunodeficiency
virus type 1 replication by regulated expression of a polymeric Tat activation response RNA
decoy as a strategy for gene therapy in AIDS." Proc Natl Acad Sci USA. 1993 Sep
1;90(17):8000-4; Bevec D, et al., "Inhibition of human immunodeficiency virus type 1
replication in human T cells by retroviral-mediated gene transfer of a dominant-negative Rev
trans-activator." Proc Natl Acad Sci USA. 1992 Oct 15;89(20):9870-4.

It is contemplated that a number of replication-competent chimeric structures can be
made that allow the function of various HCV sequence elements and proteins to be studied
and targeted in drug screening assays. For example, the invention includes
replication-competent HCV-pestivirus chimeras having a chimeric ORF. One such chimeric ORF
is one comprising an HCV sequence encoding the structural proteins and a pestivirus sequence
encoding the nonstructural proteins. It is believed that upon introduction into a cell, such an
HCV-BVDV ORF chimera will produce HCV-like virus particles that will be released from
the cell and be capable of infecting cells normally infected by wild-type HCV, i.e., cells
expressing an HCV receptor such as human CD81. Such ORF chimeras would be useful to
screen compounds for drugs that inhibit formation, release or entry of HCV particles. In
addition, ORF chimeras that produce virus particles containing at least one HCV structural
protein would be useful as vaccines against HCV. Other ORF chimeras contemplated by the
invention include, for example, chimeras comprising a pestivirus sequence encoding
structural proteins and an HCV sequence encoding one or more nonstructural proteins such as
the NS3 protease, NS4A cofactor, NS5A phosphoprotein/interferon resistance determinant
and/or the NS5B polymerase. Replication of such ORF chimeras would be dependent upon
the function of the HCV nonstructural protein(s) and these ORF chimeras could be used to
screen for drugs that target the HCV nonstructural protein(s) as well as to screen for and map
potential drug resistance mutations in HCV nonstructural proteins. In addition,
HCV-pestivirus ORF chimeras could be useful for developing alternative in vivo animal models
for HCV replication and HCV-associated hepatocellular carcinoma to evaluate antivirals and
anti-tumor agents.

The invention also provides replication-competent HCV-pestivirus chimeras having a
chimeric 3' NTR which contains one or more conserved elements of the HCV 3' NTR. Such
3' NTR chimeras would be useful for screening or evaluating compounds targeted against the
HCV 3' NTR. Compounds that could be screened include antisense RNA molecules,
ribozymes and small molecule inhibitors of critical RNA-protein interactions. One 3' NTR
chimera according to the invention comprises a BVDV 5' NTR, BVDV ORF and a chimeric
3' NTR which consists of an HCV-specific sequence derived from the HCV 3' NTR
immediately followed by a BVDV 3' NTR. The HCV-specific 3' NTR that allows for
replication in the context of BVDV has a deletion in the 3' NTR poly(U) tract but has all the
other HCV 3' NTR elements, including the 98-base 3' terminal conserved element.

HCV-pestivirus chimeras included within the scope of the invention include those
comprising combinations of chimeric regions, i.e., 5' NTR and ORF chimeras; 5' NTR and 3'
NTR chimeras; ORF and 3' NTR chimeras; and chimeric RNAs in which each of the 5' NTR,
ORF and 3' NTR regions comprises an HCV sequence operably linked to a pestivirus
sequence.
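
The region-by-region bookkeeping implied by these combinations can be modeled with a small data structure. The following Python sketch is purely illustrative: the `Region` class, the source labels, and the helper function are hypothetical names introduced here, and the model records only which parental viruses contribute sequence to each region, not the sequences themselves.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

PESTIVIRUS = "pestivirus"
HCV = "HCV"


@dataclass(frozen=True)
class Region:
    """One of the three genomic regions and the viruses contributing sequence to it."""
    name: str
    sources: FrozenSet[str]

    @property
    def is_chimeric(self) -> bool:
        # A region is chimeric when it carries both pestivirus and HCV sequence.
        return {PESTIVIRUS, HCV} <= self.sources


def chimeric_regions(regions: List[Region]) -> List[str]:
    """Names of the chimeric regions, in the order given."""
    return [region.name for region in regions if region.is_chimeric]


# Example: a 5' NTR and 3' NTR chimera with a purely pestiviral ORF.
example = [
    Region("5' NTR", frozenset({PESTIVIRUS, HCV})),
    Region("ORF", frozenset({PESTIVIRUS})),
    Region("3' NTR", frozenset({PESTIVIRUS, HCV})),
]
```

Calling `chimeric_regions(example)` lists the 5' NTR and 3' NTR, matching the second combination named above.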

The invention also provides chimeric RNAs having two ORFs, or bicistronic
HCV-pestivirus chimeras. Bicistronic chimeras contemplated by the invention include
structures in which the first ORF contains one or more HCV genes and is followed by a second
IRES operably linked to a second ORF encoding the pestivirus replicase machinery. It is also
contemplated that the first ORF may encode a heterologous sequence such as an antigen.

It is believed that many HCV-pestivirus chimeras of the invention will be attenuated
as compared to the parental wild-type pestivirus. Such attenuated chimeric RNA genomes
would be candidate vaccines in the form of live-attenuated virus particles or as RNA or
cDNA "genetic" vaccines.

The invention also includes vaccines against HCV which comprise an
immunogenically-effective amount of HCV-pestivirus particles or nucleic acid. Anti-HCV
vaccines comprising virus particles should preferably contain one or more HCV structural
proteins.

The therapeutic or pharmaceutical compositions of the present invention can be
administered by any suitable route known in the art including for example by injection such
as intraperitoneal, intravenous, subcutaneous, intramuscular, transdermal, intrathecal or
intracerebral injection. Administration can be either rapid as by injection or over a period of
time as by slow infusion or administration of a slow-release formulation.

Compositions according to the invention can be employed in the form of 
pharmaceutical or veterinary preparations. Such preparations are made in a manner well 
known in the pharmaceutical and veterinary arts. One preferred preparation utilizes a vehicle 
of physiological saline solution, but it is contemplated that other pharmaceutically acceptable
carriers such as physiological concentrations of other non-toxic salts, five percent aqueous
glucose solution, sterile water or the like may also be used. It may also be desirable that a 
suitable buffer be present in the composition. Such solutions can, if desired, be lyophilized 
and stored in a sterile ampoule ready for reconstitution by the addition of sterile water for 
ready injection. The primary solvent can be aqueous or alternatively non-aqueous. 

The carrier can also contain other pharmaceutically-acceptable excipients for
modifying or maintaining the pH, osmolality, viscosity, clarity, color, sterility, stability, rate
of dissolution, or odor of the formulation. Similarly, the carrier may contain still other 
pharmaceutically-acceptable excipients for modifying or maintaining release or absorption or 
penetration across the blood-brain barrier. Such excipients are those substances usually and
customarily employed to formulate dosages for parenteral administration in either unit dosage
or multi-dose form or for direct infusion into the cerebrospinal fluid by continuous or periodic 
infusion. 

It is also contemplated that certain formulations containing a chimeric virus according 
to the invention are to be administered orally. Such formulations are preferably encapsulated 
and formulated with suitable carriers in solid dosage forms. Some examples of suitable
carriers, excipients, and diluents include lactose, dextrose, sucrose, sorbitol, mannitol,
starches, gum acacia, calcium phosphate, alginates, calcium silicate, microcrystalline
cellulose, polyvinylpyrrolidone, cellulose, gelatin, syrup, methyl cellulose, methyl- and
propylhydroxybenzoates, talc, magnesium stearate, water, mineral oil, and the like. The
formulations can additionally include lubricating agents, wetting agents, emulsifying and
suspending agents, preserving agents, sweetening agents or flavoring agents. The
compositions may be formulated so as to provide rapid, sustained, or delayed release of the
active ingredients after administration to the patient by employing procedures well known in
the art. The formulations can also contain substances that diminish proteolytic degradation
and promote absorption such as, for example, surface active agents.

The specific dose is calculated according to the approximate body weight or body 
surface area of the patient or the volume of body space to be occupied. The dose will also be 
calculated dependent upon the particular route of administration selected. Such calculations 
can be made without undue experimentation by one skilled in the art. Exact dosages are 
determined in conjunction with standard dose-response studies. It will be understood that the
amount of the composition actually administered will be determined by a practitioner, in the 
light of the relevant circumstances including the condition or conditions to be treated, the 
choice of composition to be administered, the age, weight, and response of the individual 
patient, the severity of the patient's symptoms, and the chosen route of administration. Dose 
administration can be repeated depending upon the pharmacokinetic parameters of the dosage
formulation and the route of administration used. 

Replication-competent HCV-pestiviruses are generated by choosing the HCV 
function or sequence element desired to be studied. The HCV sequence can be obtained from 
a plasmid clone of a partial or full HCV genome using PCR to amplify a target region 
containing the desired sequence or by restriction enzyme digestion. The HCV fragment is
then inserted into the desired location of a clone of the pestivirus genome using standard
techniques. Desired portions of the pestivirus genome may be deleted before or after addition
of the HCV fragment. The recombinant genome is then transfected into a cell that supports
replication of the parental pestivirus genome, and its ability to replicate is assessed using
standard assays. For example, replication can be assessed by virus-induced cytopathic effect;
plaque formation; detection of viral antigens and/or viral RNA accumulation; and by plaque
assay measuring released infectious virus. The inventors herein have found that the BVDV RNA
replication machinery works in many cell types, including bovine, hamster, mouse and human
cells. It has also been reported that BVDV RNAs can amplify in other cell types including
human hepatoma lines and hepatocytes (Behrens SE, et al., J Virol 1998 Mar;72(3):2364-72).
The host cell range for a particular chimera will be dependent upon the properties of that
chimera as empirically determined.

As described below, some chimeras do not replicate stably as indicated by 
heterogeneity in the size of plaques produced by the chimeric virus. Upon passage, 
pseudorevertants can frequently be isolated that are capable of stable replication. Such
pseudorevertants will have one or more deletions or base substitutions in the HCV and/or 
pestivirus sequences. Information derived from these gain-of-function mutations can be used 
to define the elements necessary for generating stable, replication-competent chimeras of 
HCV and a pestivirus. 

The invention provides a method for screening compounds for antiviral activity
against HCV. The method involves comparing a test compound's effect on replication of a
chimeric HCV-pestivirus RNA molecule as described above with the compound's effect on
replication of the parental pestivirus. Compounds which have a greater effect on replication
of the chimeric virus than the pestivirus are likely directed against the HCV portion of the
chimera. Typically, the method is performed by providing duplicate cell cultures containing a
chimeric viral RNA which is replication-competent in that cell, treating one of the cultures
with the test compound, and then measuring the replication efficiency of the chimeric RNA in
both cultures. Any effect induced by the compound is compared against the compound's
effect on replication of the parental pestivirus in cells of the same type. This control assay is
preferably performed at the same time using the same culture conditions.
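
The comparison underlying this screening method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of percent inhibition as the readout, and the 20-point `margin` cut-off are hypothetical and are not taken from the specification, and the numbers are made up.

```python
def percent_inhibition(untreated: float, treated: float) -> float:
    """Percent reduction of a replication readout (e.g. plaque count or titer)."""
    if untreated <= 0:
        raise ValueError("untreated readout must be positive")
    return 100.0 * (1.0 - treated / untreated)


def compare_compound(chimera, parent, margin=20.0):
    """Compare a compound's effect on the chimera with its effect on the parent.

    chimera, parent: (untreated, treated) readouts from matched cultures.
    margin: hypothetical cut-off, in percentage points, for calling a difference.
    """
    chimera_inh = percent_inhibition(*chimera)
    parent_inh = percent_inhibition(*parent)
    return {
        "chimera_inhibition": chimera_inh,
        "parent_inhibition": parent_inh,
        # A larger effect on the chimera than on the parental pestivirus
        # points toward the HCV portion of the chimera as the target.
        "likely_hcv_directed": (chimera_inh - parent_inh) >= margin,
    }


# Illustrative numbers only (not data from the specification).
result = compare_compound(chimera=(200, 20), parent=(200, 180))
print(result)
```

A compound that inhibits both viruses to a similar degree would not be flagged, which mirrors the role of the parental pestivirus as the control in the method described above.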

The cells used in the screening assay can be prepared by transiently transfecting the 
cells with the desired chimeric RNA molecule as described below. Alternatively, it is 
contemplated that the chimeric RNA molecule can be constitutively expressed in the cell by 
transfecting the cell with a polynucleotide comprising a cDNA of the chimeric RNA operably
linked to a DNA-dependent promoter. The chimeric cDNA may include a selectable marker,
which would allow for selection of cells expressing the chimeric RNA. It is also envisioned
that the selectable marker could be a dominant marker that allows selection of cells expressing
chimeras having adaptive mutations or selection of cells permissive for virus replication
(Frolov et al., J. Virol. 73:3854-3865, 1999). It is also contemplated that the cDNA could
express a reporter gene that could be assayed to measure RNA replication.

Alternatively, chimeric virus particles are incubated with a cell permissive for 
infection by the pestivirus in the presence or absence of the test compound and then 
replication of the chimeric virus is measured and compared to the replication of the parental 
pestivirus incubated with the same cell type in the presence or absence of the test compound.

Inhibition of replication can be measured in many ways, including assaying for the
reduction of virus-induced cytopathic effect; inhibition of plaque formation; reduced
production of viral antigens as detected by immunofluorescence assay; reduced viral RNA
accumulation; and reduction in released infectious virus from treated and untreated control and
chimera samples using a plaque assay. In addition, it is contemplated that a cell line that is
designed for pestivirus-specific transactivation of a reporter gene could be used directly or in 
lieu of a plaque assay. The reporter gene is operably linked to a promoter that is activated 
upon infection by the chimeric virus and production of the viral transactivator protein. 

Preferred embodiments of the invention are described in the following examples. 
Other embodiments within the scope of the claims herein will be apparent to one skilled in the
art from consideration of the specification or practice of the invention as disclosed herein. It 
is intended that the specification, together with the examples, be considered exemplary only, 
with the scope and spirit of the invention being indicated by the claims which follow the 
examples. 

Example 1

This example illustrates the construction and analysis of 5' HCV-BVDV chimeras as 
reported in detail in Frolov et al. (RNA 4:1418-1435, 1998) which is incorporated in its 
entirety by reference. A functional clone of BVDV (Mendez et al., J. Virol. 72:4737-4745,
1998) was used to construct and characterize a series of 5' NTR chimeras with sequences
derived from HCV and the picornavirus, encephalomyocarditis virus (EMCV). The results
help to define the requirements of a functional BVDV 5' NTR and provide
replication-competent BVDV-HCV chimeras dependent on a functional HCV IRES.

Example 2 

This example illustrates the construction of chimeras for expressing additional 
functional portions of the HCV genome by addition of further HCV sequence downstream
from the functional or adapted HCV 5' NTR chimeras fused in-frame to the BVDV ORF.

One such construct (Figure 21) involves fusion of HCV sequences to BVDV 
sequences in the p7 protein coding region (at a convenient BseRI restriction site). Both HCV 
and BVDV encode a p7 protein that is located immediately downstream of the E2 protein. 
The p7 protein is a small hydrophobic protein of unknown function. pCBV/p7 consists of the
first 79 bases of the BVDV 5' NTR encoding stem-loop structures B1' and B1, followed by the
entire HCV 5' NTR, the entire HCV structural protein coding region and the first 36 amino
acids of HCV p7 fused to the C-terminal 31 amino acids of BVDV p7. The fused p7 gene is
followed by the remainder of the BVDV ORF including the entire nonstructural region and
the BVDV 3' NTR. Transfection of MDBK cells with the RNA corresponding to this
sequence (Figure 22) leads to replication of the chimeric RNA and production of the expected
HCV and BVDV polyprotein cleavage products. Variations on this strategy are envisioned in
which all or part of the HCV polyprotein and cis elements important for RNA packaging can
be expressed in viable chimeras. In addition, the BVDV replicase regions for either cytopathic
or non-cytopathic pestiviruses (like NADL cIns-) can be used. Transfection of cells
permissive for HCV particle assembly, release and reinfection with this chimeric RNA can
be used to make HCV-like particles. These particles and this infection system can be used (i)
to screen for specific inhibitors of HCV particle assembly, release and reinfection, (ii) for
identifying antibodies capable of neutralizing HCV infectivity and (iii) as live or inactivated
vaccines. Furthermore, this embodiment of the invention demonstrates that the BVDV RNA
replication machinery can be used for expression of heterologous RNA and polypeptide
sequences and can be used as a vehicle for RNA or DNA "genetic" vaccination in which the
BVDV replicase amplifies the level of antigen expression by cytoplasmic RNA-dependent
replication.

Example 3 

This example illustrates chimeric RNAs that are modified to express dominant 
selectable markers, assayable markers or FACS sortable markers. 

Such variants can be used to select for chimeras capable of replication in particular
cell types, or to screen for cell types that are permissive for replication of the chimeric RNA.
Selectable markers include, but are not limited to, the genes encoding puromycin resistance
(puromycin N-acetyl transferase; PAC), neomycin resistance, blasticidin resistance,
hygromycin resistance, etc. Assayable markers include, but are not limited to, the genes
encoding β-galactosidase, luciferase, β-glucuronidase, etc. Easily sortable molecules include
single chain antibodies, cell surface markers, and non-toxic protein markers like green
fluorescent protein. In a specific example (Figures 23 and 24), the RNA encoded by
pCBV/p7 was modified to include a cassette at the beginning of the BVDV 3' NTR that is
comprised of the EMCV IRES driving the gene encoding PAC. This chimeric RNA can
replicate, expresses PAC and confers resistance to puromycin. This property can
be used to select for variants of the chimera that are capable of noncytopathic replication in
desired cell types and also provides a means of showing that cells harbor a functional
chimeric RNA. Desired variants can be identified, cloned and further characterized as
described in Example 1. Of note, this location in the BVDV genome and this strategy
for expressing heterologous genes may also be applied to using infectious attenuated
pestiviruses as gene expression vectors and as chimeric live vaccines against other animal
pathogens.



Example 4 

This example illustrates the use of the bicistronic strategy as an alternative to the in- 
frame fusions described in Example 2. 

A specific example is shown in Figure 25 and its sequence as Figure 26. In this 
bicistronic chimera, the 5' sequences are identical to those of pCBV/p7 except that the HCV
ORF continues to include the first 246 amino acids of NS4B. The HCV sequence is followed
by the EMCV IRES fused to BVDV Npro, the N-terminal 10 aa of BVDV C, the C-terminal
19 aa of C, 9 N-terminal amino acids of Erns, 48 C-terminal amino acids of E2 and the
remainder of the BVDV NADL ORF and 3' NTR. The constructed BVDV ORF encodes a
functional BVDV RNA replicase. The deletions in the N-terminal portion of this ORF were
designed to preserve proper membrane topology and processing of the replicase. The
bicistronic chimeric RNA can replicate upon transfection of permissive BVDV host cells.

Example 5 



This example illustrates 3' NTR chimeras. Although initial attempts to recover viable
chimeric viruses in which the BVDV 3' NTR was completely replaced by that of HCV were
unsuccessful, a strategy similar to that detailed in Example 1 has produced chimeras that
harbor the conserved elements of the HCV 3' NTR. An initial tandem 3' NTR construct was
made in which the HCV 3' NTR was engineered to follow the BVDV ORF. The complete
BVDV 3' NTR was positioned 3' to the HCV 3' NTR after a short heterologous sequence. The
sequence of this parental construct, which replicated poorly, is shown in Figure 19. RNAs
transcribed from this plasmid were of low specific infectivity, suggesting that revertants or
pseudorevertants might have arisen. Indeed, isolation and sequence analysis of several
independent plaque-forming variants revealed that deletions in the HCV poly U tract of
various lengths had occurred. These revertant sequences are shown in Figure 20. When these
altered HCV 3' NTRs were reconstituted into the original tandem 3' NTR parent, they gave
rise to plaque forming RNA transcripts of high specific infectivity, demonstrating that these
alterations restored the ability of the chimeric RNA to replicate. Large deletions in the U tract
gave rise to virus with more robust replication and larger plaques while stably maintaining the
conserved HCV 3' NTR 98-base element and the polypyrimidine "transition" region. Such
chimeric viruses can now be used to screen and evaluate antisense, ribozyme, and other
therapeutics targeted against this conserved HCV RNA element that is essential for
replication.

Materials and Methods

Plasmid Constructs 

pACNR/BVDV NADL was previously described (Mendez et al., 1998, supra).
pBVDV is a derivative of pACNR/BVDV NADL which contains a G->T transversion at nt
14994 that creates an Xba I site upstream of the T7 promoter (T. Myers & CM. Rice,
unpubl.). To facilitate construction of the chimeras, subclones were created. First, two
fragments were isolated by PCR amplification of p90/HCVFLlongpU (Kolykhalov et al.,
Science 277:570-574, 1997) with primers #498 (5'-TGTACATGGCACGTGCCAGCCCC)
and #498 (5'-ATCAACTCCATGGTGCACGGTCT) and pBVDV with primers #481
(5'-AGACCGTGCACCATGGAGTTGATC) and #482
(5'-CGTTTCACACATGGATCCCTCCTC). These two fragments were digested with ApaL I
and ligated to produce a fragment containing a fusion of the HCV 5' NTR to the BVDV ORF.
This fragment was digested with Sac I and ligated into pGEM3Zf(-) which had been digested
with Sma I and Sac I to produce the subclone pGEM498-SacI. Next, a fragment containing
the BVDV 5' NTR was synthesized by PCR amplification of pBVDV with primers #183
(5'-ttttctagataatacgactcactat...) and #480 (5'-GGGGCTGGCACGTGCCATGTACA). This
fragment was digested with Xba I and BsrG I and ligated into pGEM498-SacI digested with the
same two enzymes, to create the plasmid pGEMXbaI-SacI. pGEMXbaI-SacI contains a tandem
fusion of the BVDV 5' NTR, the HCV 5' NTR, and the 5' portion of the BVDV Npro gene.
pBVDV + HCV was created by digesting pGEMXbaI-SacI with Xba I and Sac I and ligating the
fragment into pBVDV digested with the same two enzymes, and as such pBVDV + HCV contains
the T7 promoter, followed by the entire 385-nt 5' NTR of BVDV, a GT dinucleotide (nt 386-387),
the entire 341-nt 5' NTR of HCV (nt 388-728), and the sequence of the BVDV NADL strain
including the ORF and 3' NTR. Derivatives of pBVDV + HCV containing deletions within
the BVDV 5' NTR and/or the HCV 5' NTR were created in the subclone pGEMXbaI-SacI, as
described below, prior to ligation into Xba I- and Sac I-digested pBVDV. For making
deletions, restriction sites with non-compatible protruding ends were treated with the
Klenow fragment of DNA polymerase I prior to ligation. For creation of pBVDV +
HCVdelB3 (deletion of nt 174-374, inclusive), pGEMXbaI-SacI was digested with Afl II and
BsrG I. For pBVDV + HCVdelB2B3 (deletion of nt 67-374), pGEMXbaI-SacI was digested
with Avr II and BsrG I. For pBVDV + HCVdelB1B2B3 (deletion of nt 33-374),
pGEMXbaI-SacI was digested with SnaB I and BsrG I. For pBVDV + HCVdelB2B3H1 (deletion
of nt 67-3396), pGEMXbaI-SacI was digested with Avr II and Xcm I. For pBVDV +
HCVdelB2B3H1H2 (deletion of nt 67-513), pGEMXbaI-SacI was digested with Avr II and
Bsg I. For pBVDV + HCVdelB2B3H3 (deletion of nt 67-374, 518-704), subclone
pGEMXbaI-SacIdelB2B3 was digested with Sma I. p5'HCV was created by digesting
p90/HCVFLlongpU with Xba I and Nru I and ligating the fragment into pBVDV + HCV
digested with the same two enzymes.
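
The coordinate system used for these deletions follows from the layout of pBVDV + HCV given above (BVDV 5' NTR at nt 1-385, GT dinucleotide at nt 386-387, HCV 5' NTR at nt 388-728). As a rough aid to reading the deletion ranges, the following Python sketch tallies how many nucleotides of each segment a given deletion removes. The function names are illustrative only, and only deletion ranges quoted in the text are used.

```python
# Segment boundaries as stated in the text (1-based, inclusive).
SEGMENTS = [
    ("BVDV 5' NTR", 1, 385),
    ("GT dinucleotide", 386, 387),
    ("HCV 5' NTR", 388, 728),
]


def deletion_breakdown(start: int, end: int) -> dict:
    """Count how many nucleotides of each segment fall within start..end."""
    breakdown = {}
    for name, seg_start, seg_end in SEGMENTS:
        overlap = min(end, seg_end) - max(start, seg_start) + 1
        if overlap > 0:
            breakdown[name] = overlap
    return breakdown


def hcv_position(chimera_nt: int) -> int:
    """Convert a chimera coordinate inside the HCV 5' NTR to HCV numbering."""
    if not 388 <= chimera_nt <= 728:
        raise ValueError("position is outside the HCV 5' NTR segment")
    return chimera_nt - 387


# Deletion ranges quoted in the text.
DELETIONS = {
    "delB3": (174, 374),
    "delB2B3": (67, 374),
    "delB1B2B3": (33, 374),
    "delB2B3H1H2": (67, 513),
}
for label, (start, end) in DELETIONS.items():
    print(label, end - start + 1, deletion_breakdown(start, end))
```

For example, the delB2B3H1H2 range (nt 67-513) spans the remainder of the BVDV 5' NTR, the GT dinucleotide, and the first 126 nt of the HCV 5' NTR, whereas the delB3 range lies entirely within the BVDV 5' NTR.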

The EMCV plasmid, pEC9, was provided by Ann Palmenberg and is described
elsewhere (Hahn et al., J. Virol. 69:2697-2699, 1995). p5'EMCV contains the entire 710 nt of
the 5' NTR of EMCV, followed by the open reading frame of BVDV and the 3' NTR. One
extra G residue was added between the T7 promoter and the first nucleotide of the EMCV 5'
NTR to facilitate efficient in vitro transcription. Convenient restriction sites within the
BVDV 5' NTR or the EMCV 5' NTR were used to create additional chimeras. Sites with
noncompatible protruding ends were treated with the Klenow fragment of DNA polymerase I
prior to ligation. For example, the plasmid pBVDV + EMCVdelA contains nt 1-378 of
BVDV 5' NTR fused with nt 45-710 of EMCV (the BsrG I site of BVDV ligated to the EcoR
V site of EMCV). pBVDV + EMCVdelB3A contains nt 1-173 of BVDV fused with nt 45-710
of EMCV (the Afl II site of BVDV ligated to the EcoR V site of EMCV). pBVDV +
EMCVdelB2B3A contains nt 1-66 of BVDV fused with nt 45-710 of EMCV (the Avr II site
of BVDV ligated to the EcoR V site of EMCV). pBVDV + EMCVdelB3ABC contains nt
1-173 of BVDV fused with nt 161-710 of EMCV (the Afl II site of BVDV ligated to the
Psp1406 I site of EMCV). pBVDV + EMCVdelB2B3ABC contains nt 1-66 of BVDV fused with nt
161-710 of EMCV (the Avr II site of BVDV ligated to the Psp1406 I site of EMCV). pBVDV
+ EMCVdelB3A-H contains nt 1-101 of BVDV fused with nt 289-710 of EMCV (the Nhe I
site of BVDV ligated to the Avr II site of EMCV). pBVDV + EMCVdelB2B3A-H contains
nt 1-62 of BVDV fused with nt 289-710 of EMCV (the Avr II site of BVDV ligated to the Avr
II site of EMCV). The schematics of the chimeric 5' NTRs are presented in Figures 2 and 4.
All other heterologous 5' NTRs used in the study were generated by PCR using an
oligonucleotide complementary to nt 256-272 of the HCV 5' NTR and primers containing the
sequence of the Xba I restriction site followed by the T7 promoter, the heterologous
sequences found in sequenced pseudorevertants, or sequences corresponding to different
regions of the HCV 5' NTR. All the fragments were subcloned into the plasmid, pRS2 (a
derivative of pUC19), sequenced, and recloned into the p5'HCV plasmid by replacing the
fragment between the Xba I site located upstream of the T7 promoter and the Nhe I site (nt
249-254) in the 5' NTR of HCV.
Cell cultures 

MDBK cells were obtained from M. Collett (ViroPharma, Inc.) and BT cells were
obtained from the American Type Culture Collection (Rockville, Maryland). Cells were
grown in Dulbecco's modified Eagle medium (D-MEM) supplemented with 10% horse serum
and sodium pyruvate.
Transcriptions and transfections 

All the designed plasmids, including pBVDV and the chimeric derivatives, were
digested to completion with Sda I (Sse8387 I), purified by phenol extraction, precipitated by
ethanol, and dissolved in water. The transcription reactions were performed with the T7
Megascript kit (AMBION) using the conditions recommended by the manufacturer.
Reactions were incubated at 37°C for 1 h, and 3H-UTP was added to the reaction to quantify
the RNA synthesis. The quality of the synthesized RNAs was checked by agarose gel
electrophoresis, and samples containing 50-60% of full-length RNA were used for
electroporations and in vitro translations. The reaction mixtures were aliquoted and stored at
-70°C prior to electroporation or in vitro translations.

Transfection was performed by electroporation of MDBK cells using previously
described conditions (Mendez et al., 1998, supra). Two micrograms of in vitro synthesized
RNA, corresponding to approximately 1 µg of the full-length transcript, were used per
electroporation. In standard experiments, ten-fold dilutions of electroporated cells were
seeded in 6-well tissue culture plates containing 5 x 10^5 naive MDBK cells per well. After 1
h of incubation at 37°C in a 5% CO2 incubator, cells were overlaid with 3 ml of 0.6% LE
SeaKem agarose (FMC Bioproducts) containing minimal essential medium supplemented
with 5% horse serum. Plaques were stained with crystal violet after 3 days of incubation at
37°C. The rest of the transfected cells were seeded into 100-mm dishes and incubated for
approximately 48 h or until cytopathic effect was observed in virtually all cells. Samples of
the media were taken at 24 and 48 h, and virus titers were determined as described above and
previously (Mendez et al., 1998, supra).
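
As a rough illustration of how a specific infectivity (PFU per µg RNA) could be derived from such a dilution series, the sketch below back-calculates from one countable well. The function name, the idea of expressing each well as a fraction of the electroporated cells, and the choice of RNA amount used for normalization are assumptions made for illustration; the text does not spell out the exact calculation, and the example numbers are not experimental data.

```python
def specific_infectivity(plaques: int, fraction_plated: float, rna_ug: float) -> float:
    """Estimate PFU per microgram of RNA from a single countable well.

    plaques: number of plaques counted in the well.
    fraction_plated: fraction of the electroporated cells seeded in that well
        (for example 1e-4 for a well from a ten-fold dilution series).
    rna_ug: micrograms of RNA electroporated (assumed normalization basis).
    """
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    if rna_ug <= 0:
        raise ValueError("rna_ug must be positive")
    return plaques / fraction_plated / rna_ug


# Hypothetical well: 40 plaques where 1/100,000 of the cells were seeded,
# normalized to 1 microgram of full-length transcript.
print(f"{specific_infectivity(40, 1e-5, 1.0):.1e} PFU/ug")
```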

Analysis of the 5' ends of viral genomes

Sequencing of the 5' ends of selected variants of BVDV was performed on plaque- 
purified viruses. Plaques were typically isolated from the agarose overlay without staining 
with neutral red. Virus was eluted in 1 ml of D-MEM/10% horse serum for several hours and 
was used to infect 5 x 10^5 MDBK cells in 35-mm dishes. After 1 h of virus adsorption at
37°C, an additional 1 ml of D-MEM/10% horse serum was added to the dishes, and incubation
was continued for 36-48 h until cytopathic effect was observed in virtually all cells.

Fifty microliters of harvested viral stocks were clarified by low speed centrifugation, 
and viral RNAs were isolated by TRIzol reagent (Gibco-BRL) using the protocol 
recommended by the manufacturer. Sequencing of the 5' termini was performed using an 
oligonucleotide/cDNA-ligation strategy described elsewhere (Troutt et al., Proc. Natl. Acad.
Sci. USA 89:9823-9825, 1992). The primer SI (5'-GTCGTTTCACACATGGATCC),
complementary to nt 710-729 of the BVDV genome, was used for cDNA synthesis. A
phosphorylated oligonucleotide tag (5'-GACTGTTGTGGCCTGCAGGGCCGAATT) with an
amino group on the 3' terminus was ligated to the first strand cDNA (Troutt et al., 1992,
supra). One tenth of this reaction mixture was used for PCR amplification. The primers for
PCR amplification were as follows: primer A (5'-GCCCTGCAGGCCACAACAGTC),
complementary to the tag; primer B (5'-TCAGGCAGTACCACAA), complementary to nt
281-296 of the HCV 5' NTR; and primer C (5'-GGAATGCTCGTCAAGAAGACAG),
complementary to nt 268-289 of the EMCV 5' NTR. The primer pairs of A + B or A + C
were used for analysis of the pseudorevertants of 5'HCV and BVDV + HCVdelB1B2B3 or
5'EMCV, respectively. For the 5'HCV pseudorevertants, one tenth of the ligation mixture
was used for an additional PCR reaction. This fragment was synthesized using primer SI,
described above, and a primer corresponding to nt 147-175 of the HCV genome. Fragments
were purified by agarose gel electrophoresis and cloned into the plasmid pRS2. Multiple
independent clones were sequenced by the standard dideoxy-mediated chain termination
methods using the Sequenase version 2.0 DNA Sequencing Kit (USB).
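
The stated complementarity between primer A and the ligated tag can be checked directly from the two sequences given above. The short Python sketch below does this with a reverse-complement comparison; the helper names are illustrative only.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneal_site(primer: str, template: str) -> int:
    """0-based index on the template where the primer would anneal, or -1."""
    return template.upper().find(reverse_complement(primer))


# Sequences as given in the text.
TAG = "GACTGTTGTGGCCTGCAGGGCCGAATT"
PRIMER_A = "GCCCTGCAGGCCACAACAGTC"

print(reverse_complement(PRIMER_A))
print(anneal_site(PRIMER_A, TAG))
```

The reverse complement of primer A matches the first 21 nucleotides of the tag, consistent with primer A priming from the tag ligated to the first-strand cDNA.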
Cell-free translation 

Cell-free translation reactions were performed in reticulocyte extracts (Promega) 
using conditions recommended by the manufacturer. Usually 0.1-1 µg of the same in vitro
synthesized RNAs used in transfection experiments were used in 25 µl translation reactions.
After 45 min of incubation at 30 °C, 2 µl were dissolved in 10 µl of sample buffer, and those
samples were analyzed by sodium dodecyl sulfate PAGE. Labeled proteins were visualized
by autoradiography of the dried gel. The efficiency of translation was measured using
phosphorimager analysis (Molecular Dynamics) by comparing the radioactivity in the band
corresponding to the Npro protein. In preliminary experiments, an eightfold increase in
incorporation was observed for translation of 4 µg versus 0.4 µg BVDV transcript RNA.
Quantitative data were obtained from reactions using subsaturating (0.4 µg) amounts of
BVDV or BVDV chimera transcript RNAs.
Analysis of virus specific RNAs

The protocols used for radioactive labeling of virus-specific RNAs are described in 
the appropriate figure legends. RNAs were isolated from the cells by using TRIzol reagent as 
recommended by the manufacturer (Gibco-BRL). After denaturation with glyoxal in 
dimethylsulfoxide, cellular RNAs were analyzed by electrophoresis in a 1% agarose gel
containing a 10 mM phosphate buffer. Pieces of the dried gel containing the appropriate 
RNA bands were excised, and their radioactivity measured by liquid scintillation counting. 

Results 

Features of the BVDV, HCV, and EMCV 5' NTRs important for chimera design

Schematic representations of the proposed secondary structures of the 5' NTRs of 
HCV, BVDV, and EMCV are shown, and the location of each IRES is indicated in Figure 1. 
EMCV is a member of the cardiovirus genus within the family Picornaviridae. While not a 
member of the Flaviviridae, EMCV is similar to HCV and BVDV in that it is a
positive-strand RNA virus shown to contain an IRES within its 5' NTR (Jang et al., J. Virol.
62:2636-2643, 1988). Based on their proposed secondary structures, the HCV IRES and the BVDV
IRES have been classified as type 3 IRESs, while the EMCV IRES is classified as a type 2
IRES (Lemon & Honda, Semin. Virol. 8:274-288, 1997). However, these three IRESs as
well as IRESs from other members of the Flaviviridae and the Picornaviridae have been
proposed to contain a common structural core (Le et al., Virus Genes 12:135-147, 1996).

The model for the secondary structure of the 341-nt HCV 5' NTR has been refined by
enzymatic and chemical analysis of synthetic transcripts (Brown et al., Nucl. Acids Res.
20:5041-5045, 1992; Wang et al., J. Virol. 68:7301-7307, 1994; Honda et al., RNA 2:955-968,
1996; Lima et al., 1997). This element contains four discrete hairpins (referred to here as H1,
H2, H3 and H4) and a pseudoknot at the base of hairpin H3 (Wang et al., 1995). The
secondary structure of the 385-nt BVDV 5' NTR has not been as extensively studied, but is
proposed to be similar to that of HCV (Brown et al., 1992) with four discrete hairpins
(referred to here as B1', B1, B2, and B3) and a pseudoknot at the base of B3 (Rijnbrand et al.,
1997). The secondary structure of the longer (>700 nt) EMCV 5' NTR consists of a series of
hairpins A-M (Duke et al., 1992; Hoffman & Palmenberg, 1996). Recently, a revised model
of the EMCV 5' NTR suggests moderately different secondary structures for the C and G
subregions, and significantly different secondary structures for the I-M subregion
(Palmenberg & Sgro, 1997).

For HCV, H1 is nonessential for IRES function (Reynolds et al., 1995; Rijnbrand et
al., 1995; Honda et al., 1996b; Reynolds et al., 1996; Kamoshita et al., 1997) and its deletion
has actually increased translation efficiency in some analyses (Rijnbrand et al., 1995; Honda
et al., 1996b). Most studies have found that hairpins H2 and H3 and the pseudoknot are
essential for IRES function (Wang et al., 1993; Rijnbrand et al., 1995; Honda et al., 1996b).
However, two studies indicate that H2 may not be essential (Tsukiyama-Kohara et al., 1992;
Urabe et al., 1997). The 3' boundary of the HCV IRES is more controversial. The IRES
clearly extends to the AUG initiation codon. However, some studies indicate that sequences
affecting the efficiency of translation initiation extend into the ORF (Reynolds et al., 1995;
Honda et al., 1996a; Honda et al., 1996b; Lu & Wimmer, 1996). By analogy to the HCV
IRES and the related pestivirus CSFV IRES, the BVDV IRES probably requires hairpins B2
and B3 and the pseudoknot for function, with B1' and B1 probably not required for IRES
activity (Poole et al., 1995; Rijnbrand et al., 1997). For EMCV, hairpins H-L have been
shown to be required for IRES function in mono- or dicistronic constructs (Jang & Wimmer,
1990; Duke et al., 1992). The remaining portion of the EMCV 5' NTR is thought to be
required for RNA replication or unknown steps in viral replication that are important for
pathogenesis (Duke et al., 1990; Martin & Palmenberg, 1996).



Replacement of the BVDV 5' NTR with the HCV 5' NTR results in a large decrease in
specific infectivity 

Since the BVDV 5' NTR and the HCV 5' NTR are proposed to have similar RNA
secondary structure and functional organization, an experiment was performed to test whether
the BVDV 5' NTR could be replaced by the HCV 5' NTR. p5'HCV has an exact replacement
of the BVDV 5' NTR with that of HCV (Fig. 2A) while the coding sequence and 3' NTR of
p5'HCV are identical to pBVDV. Positioning of the HCV 5' NTR in such a manner was
necessary since translation initiation from the HCV IRES begins at or near the AUG start
codon (Honda et al., 1996a; Reynolds et al., 1995; Reynolds et al., 1996; Rijnbrand et al.,
1996). The specific infectivity of 5' HCV RNA synthesized in vitro was compared to that of
BVDV RNA by transfection of MDBK (bovine kidney) cells (Fig. 2A). The specific
infectivity of BVDV RNA was approximately 4 x 10^6 plaque forming units (PFU)/µg RNA.
In contrast, the specific infectivity of 5' HCV RNA was near the limit of detection (30-50
PFU/µg RNA) and considerable plaque heterogeneity was apparent. These results suggested
that the HCV 5' NTR replacement chimera might be incapable of efficient replication and
plaque formation and that the plaque forming virus observed had arisen by secondary
mutation(s). Sequence analysis of plaque-purified 5' HCV viruses presented below confirmed
that the replicating pool of virus contained such pseudorevertants.

Next, the in vitro translation efficiency of these two RNAs in rabbit reticulocyte
extracts was analyzed to test whether the defect in specific infectivity of 5' HCV RNA could
be attributed to lower translation efficiency. Although the specific infectivity of 5' HCV RNA
was reduced ~5 logs compared to BVDV RNA, its translation efficiency was only slightly
reduced, ~twofold (Fig. 3, lane 1 vs. lane 2). The apparent size of the N-terminal cleavage
product, Npro, was identical for both RNAs, suggesting that translation initiated with the
correct AUG. These data are consistent with the hypothesis that the BVDV 5' NTR contains
signals that are required for a step in replication other than translation and that are not
present in the 5' HCV chimera.
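
The size of the infectivity defect can be checked with a short calculation. The sketch below
is illustrative only: it takes the approximate values reported above (4 x 10^6 PFU/µg for
BVDV RNA and 30-50 PFU/µg for 5' HCV RNA) and expresses the reduction in log10 units. The
function name is hypothetical and is not part of the described methods.

```python
import math


def log10_reduction(reference_pfu_per_ug, test_pfu_per_ug):
    """Return the log10 fold reduction in specific infectivity."""
    if reference_pfu_per_ug <= 0 or test_pfu_per_ug <= 0:
        raise ValueError("specific infectivities must be positive")
    return math.log10(reference_pfu_per_ug / test_pfu_per_ug)


# Approximate values taken from the text above.
bvdv_pfu_per_ug = 4e6
low = log10_reduction(bvdv_pfu_per_ug, 50)   # upper end of the 5' HCV range
high = log10_reduction(bvdv_pfu_per_ug, 30)  # lower end of the 5' HCV range
print(f"{low:.1f} to {high:.1f} logs")
```

Both ends of the range round to about 5 logs, in line with the reduction stated in the text,
whereas the twofold drop in translation corresponds to only about 0.3 logs.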

Given the low specific infectivity of 5' HCV RNA, an experiment was performed to
test the effect of placing the BVDV 5' NTR sequence upstream of the HCV 5' NTR, resulting
in tandem BVDV and HCV 5' NTRs (called BVDV + HCV). This arrangement actually
decreased translation efficiency (Fig. 3, lane 14 vs. lane 1) yet restored infectivity (Fig. 2A).
The plaques produced by BVDV + HCV were also heterogeneous in size, indicating that this
virus was unstable. Upon passage, RT-PCR analysis indicated that pseudorevertants had
indeed arisen in which portions of the BVDV and/or HCV 5' NTRs had been deleted (data not
shown). These data show that sequences in the BVDV 5' NTR required for virus replication
can function when placed upstream of a functional HCV IRES driving translation of the
BVDV polyprotein.

Hairpins B1' and B1 in conjunction with the HCV IRES are sufficient for stable and
efficient BVDV replication

The sequences within the BVDV 5' NTR that restored replication in the context of the
HCV 5' NTR were mapped using three deletion variants. The deletion in BVDV + HCVdelB3
removed a large portion of hairpin B3; the deletion within BVDV + HCVdelB2B3 removed
hairpins B2 and B3, and the deletion within BVDV + HCVdelB1B2B3 removed hairpins B1,
B2 and B3. The specific infectivities of RNAs from these deletion mutants were near that of
BVDV RNA (Fig. 2). Upon passage of these viruses, RT-PCR analyses and sequencing
indicated that BVDV + HCVdelB3 and BVDV + HCVdelB2B3 were stably propagated and
produced homogeneous plaques slightly smaller than those of wild-type BVDV (data not
shown). In contrast, BVDV + HCVdelB1B2B3 produced smaller heterogeneous plaques.
Reverse transcription-polymerase chain reaction (RT-PCR) analysis and sequencing indicated
that BVDV + HCVdelB1B2B3 underwent a reversion event described in more detail below.
The translation efficiencies of these three RNAs (Fig. 3, lanes 9, 10, and 12) were similar to
that of BVDV + HCV RNA (Fig. 3, lane 14), indicating that the deleted portions (hairpins B1,
B2, and B3) are not required for translation in the BVDV + HCV chimera. These results show
that B1' and B1 are the minimal elements sufficient for stable replication in conjunction with
the HCV 5' NTR.

Having shown that B1' and B1 are sufficient for replication in conjunction with the
HCV 5' NTR, we next conducted a deletion analysis to determine the sequences within the
HCV 5' NTR of BVDV + HCVdelB2B3 required for replication. A large portion of H1 was
deleted in BVDV + HCVdelB2B3H1, while both H1 and H2 were deleted in BVDV +
HCVdelB2B3H1H2. Of these two RNAs, only BVDV + HCVdelB2B3H1 was as infectious as
parental BVDV RNA (Fig. 2B). However, the BVDV + HCVdelB2B3H1 virus produced
smaller plaques than BVDV + HCVdelB2B3, indicating that hairpin H1 may augment
replication of the chimera. In contrast, BVDV + HCVdelB2B3H1H2 RNA was not
infectious (Fig. 2B) and was translated poorly (Fig. 3, lane 11). Diminished HCV IRES
activity might be due to deletion of hairpin H2 or juxtaposition of BVDV hairpins B1' and B1
with H3. A third derivative of BVDV + HCVdelB2B3, with a Sma I-Sma I deletion
abrogating HCV IRES function by removing H3, was also not infectious (data not shown).
Thus, a 5' NTR consisting of B1' and B1 and a functional HCV IRES is sufficient for stable
BVDV replication in MDBK cells. Similar results were obtained in BT cells, another BVDV-
permissive continuous bovine cell line (data not shown).

Replacement of the BVDV 5' NTR with the EMCV 5' NTR

The following experiment was performed to determine whether the BVDV 5' NTR
could be replaced by the 5' NTR of a more phylogenetically distant virus, EMCV. A
derivative of BVDV was created, called 5' EMCV, that contains an exact replacement of the
BVDV 5' NTR with the EMCV 5' NTR plus an additional guanosine residue at the 5' terminus
for more efficient transcription initiation by T7 polymerase (Fig. 4A). The specific infectivity
of 5' EMCV RNA was more than three orders of magnitude lower than that of BVDV RNA,
indicating that it was defective for replication, although its specific infectivity was higher than
that of 5' HCV RNA (compare Figs. 4A and 2A). Similar to 5' HCV, 5' EMCV produced
heterogeneous plaques, and sequence analysis indicated that pseudorevertants had arisen. The
lower specific infectivity of 5' EMCV RNA was not likely due to a defect in translation,
since the translation efficiency of 5' EMCV RNA was about threefold higher in vitro than that
of BVDV RNA (Fig. 3, lane 20 vs. lane 19).

As was done for BVDV + HCV, it was also determined whether placing the BVDV 5' NTR
at the 5' end of the 5' EMCV RNA would increase its specific infectivity. BVDV + EMCVdelA
(Fig. 4A) contained the entire BVDV 5' NTR in tandem with the EMCV 5' NTR lacking a portion
of hairpin A. BVDV + EMCVdelA RNA had a specific infectivity near that of BVDV RNA
(compare Figs. 4A and 2A) despite having a lower translation efficiency than 5' EMCV (Fig.
3, lane 21 vs. lane 20). Similar to the results with BVDV + HCV, this implicates the added
BVDV 5' NTR sequence in a step in viral replication other than translation. Two derivatives
of BVDV + EMCVdelA that contain deletions of portions of the BVDV 5' NTR but maintain
the sequence of B1' and B1, BVDV + EMCVdelB3A and BVDV + EMCVdelB2B3A (Fig.
4A), also were infectious. These derivatives had translation efficiencies near that of the
parental BVDV + EMCVdelA (Fig. 3, compare lanes 15 and 16 with lane 21). This
demonstrated that hairpins B1' and B1 were sufficient for replication in conjunction with a
large portion of the EMCV 5' NTR. Derivatives of BVDV + EMCVdelB3A or BVDV +
EMCVdelB2B3A that contain further deletions of EMCV (BVDV + EMCVdelB3ABC and
BVDV + EMCVdelB2B3ABC in particular) were translated efficiently (Fig. 3, lanes 17 and
18) and were infectious (Fig. 4B). This indicates that the chimeras did not require putative
EMCV RNA replication signals (Martin & Palmenberg, 1996). However, derivatives with
deletions extending into the canonical EMCV IRES were not infectious. For example, BVDV
+ EMCVdelB3A-H and BVDV + EMCVdelB2B3A-H, in which a portion of hairpin H is
deleted, were not infectious (Fig. 4B) and were inefficiently translated in vitro (Fig. 3, lanes
22 and 23). It should be noted that all of the BVDV + EMCV chimeras produced plaques of
heterogeneous size, indicating some instability.


Relatively simple 5' NTR mutations are observed in adapted pseudorevertants 

As mentioned previously, BVDV + HCVdelB1B2B3 did not replicate stably as
indicated by the heterogeneity in the size of plaques produced by this virus. Upon passage
and selection of medium plaque-producing variants, 5' RACE analysis and sequencing
indicated that nt 1-26 had been deleted in the pseudorevertants, removing a large portion of
B1', which was apparently deleterious in the absence of B1. This deletion results in the 5'
terminal sequence 5'-GUAUCG, which is identical to the first six bases of BVDV genome
RNA (Fig. 5) and is repeated at positions 27-32.

Analysis of the passaged 5' EMCV virus indicated that the replicating progeny had
also undergone a simple deletion of sequence at the 5' end to generate more efficiently
replicating variants (Fig. 5). After electroporation, the 5' EMCV virus pool was passaged 5
times at a multiplicity of infection of 0.1-1 PFU/cell on MDBK or BT cells, and the 5' termini
of three randomly picked plaques were sequenced. For all three plaques selected, nt 2-209
had been deleted, again creating a genome RNA with the 5' terminal tetranucleotide sequence
5'-GUAU.




Analysis of the 5' HCV progeny indicated that more complicated variants had arisen.
Most small plaque-producing variants were unstable and quickly reverted to medium plaque-
producing variants. However, one small plaque-producing variant and two stable medium
plaque-producing variants were isolated. 5' terminal sequences of the variants were amplified
by rapid amplification of cDNA ends (RACE) and cloned into a plasmid vector, and
sequences for several independent colonies were determined. The sequence of three clones of
the small plaque-producing virus (5'HCV.R1) contained a deletion of HCV sequence from nt
1-34 and an addition of the dinucleotide 5'-AU in two clones and 5'-GU in the third clone.
This creates a 5' terminus of 5'-(G/A)UAA (Fig. 5B), reminiscent of the first three bases of
the BVDV genome RNA (5'-GUA). Both medium plaque variants appeared to have arisen by
RNA recombination with non-viral sequences (Fig. 5). One medium plaque variant
(5'HCV.R2) had deleted the first 21 bases of the HCV sequence and contained instead a
heterologous sequence of 22 bases. BLAST searches revealed a perfect match between this
sequence and a sequence in a human retina cDNA of unknown function (Tsp509I). The
second medium plaque variant (5'HCV.R3) had also undergone a possible recombination
event leading to the addition of 12 nt to the 5' end of the HCV sequence. Given its short
length, multiple matches were found in the database with this sequence. As for the small
plaque variant, sequencing of multiple clones revealed heterogeneity at the extreme 5' end,
with either G or A identified as the 5' base. Remarkably, for both medium plaque variants,
the fused heterologous sequence began with the tetranucleotide sequence 5'-(G/A)UAU (Fig.
5B). For all three variants, sequencing of the entire 5' NTR and a portion of the Npro coding
region revealed only these changes at the 5' termini.

5' NTR sequence changes are sufficient for the pseudorevertant phenotypes 

To assess the importance of these alterations at the 5' terminus of the 5' HCV
pseudorevertants, derivatives of 5' HCV were created with the changes determined by 5'
RACE (Fig. 6A), and the specific infectivities of these RNAs were analyzed (Fig. 6B).
Corresponding to the small plaque variant, a derivative called 5'HCV.R1orig was engineered
which contained a 5' NTR consisting of the dinucleotide 5'-GU at the 5' terminus of HCV nt
35-341. This results in a 5' terminus consisting of 5'-GUAA. 5'HCV.R1orig RNA had a
specific infectivity at least four orders of magnitude higher than 5' HCV RNA (Figs. 6B and
2A). This demonstrates that this 5' NTR structure is sufficient for phenotypic reversion to
high specific infectivity. However, small plaques and considerable heterogeneity were
observed for 5'HCV.R1orig, suggesting that additional mutations may be present in the
original small plaque variant.

The engineered derivative 5'HCV.R2orig had a 5' NTR consisting of 22 nt of
Tsp509I-homologous sequence followed by HCV nt 22-341. Another construct, called
5'HCV.R3orig, was made, which has the 12 nt of the other heterologous sequence fused to the
intact HCV 5' NTR. Specific infectivities for both these derivatives were essentially the same
as observed for wild type BVDV RNA (2-4 x 10^6 PFU/µg; Fig. 6B). Transfection with these
transcripts produced medium plaques, as observed for the original variants, and this
phenotype was stable upon passaging. These results show that the altered 5' NTR sequences
were responsible for the pseudorevertant phenotypes rather than changes elsewhere in their
genomes.



Addition of the tetranucleotide sequence 5'-GUAU to the HCV 5' NTR allows efficient
BVDV replication

For all three 5' HCV variants studied, as well as the BVDV + HCVdelB1B2B3 and
5'EMCV pseudorevertants, 5' NTR alterations seemed to involve creation of a three- or four-
base "consensus" sequence identical to the 5' terminus of BVDV genome RNA. To test the
importance of this sequence, as opposed to fused heterologous sequences, we created a set of
variants with the BVDV 5' tetranucleotide sequence linked to the HCV 5' NTR or the
deletion/recombinant break points identified during sequence analysis of the 5' HCV
pseudorevertants (Fig. 6A). 5'HCV.R1cons had the tetranucleotide sequence 5'-GUAU fused
to HCV nt 35-341. 5'HCV.R2cons had the 5'-GUAU tetranucleotide sequence fused to HCV
nt 22-341. 5'HCV.R3cons contained the tetranucleotide sequence 5'-GUAU fused to the intact
5' terminus of the HCV NTR. RNAs from all three of these derivatives had specific
infectivities more than five orders of magnitude higher than 5'HCV and comparable to
parental BVDV (Fig. 6B).
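
The layout of the three "cons" constructs can be sketched as simple string operations. The
example below is an illustration only, not the cloning procedure: the HCV 5' NTR is
represented by a 341-nt placeholder of "N" characters (the actual sequence is not reproduced
here), and the helper name fuse_5_prime is hypothetical. Only the break points (nt 35, nt 22,
and nt 1) and the 5'-GUAU prefix come from the text.

```python
def fuse_5_prime(prefix, ntr, first_nt):
    """Fuse `prefix` to `ntr`, keeping `ntr` from 1-based position `first_nt` onward."""
    if not 1 <= first_nt <= len(ntr):
        raise ValueError("first_nt must lie within the NTR")
    return prefix + ntr[first_nt - 1:]


HCV_NTR_LENGTH = 341                      # HCV 5' NTR spans nt 1-341 in the text
placeholder_ntr = "N" * HCV_NTR_LENGTH    # placeholder, NOT the real HCV sequence

constructs = {
    "5'HCV.R1cons": fuse_5_prime("GUAU", placeholder_ntr, 35),  # GUAU + nt 35-341
    "5'HCV.R2cons": fuse_5_prime("GUAU", placeholder_ntr, 22),  # GUAU + nt 22-341
    "5'HCV.R3cons": fuse_5_prime("GUAU", placeholder_ntr, 1),   # GUAU + intact NTR
}

for name, seq in constructs.items():
    print(name, seq[:4], len(seq))
```

Under these assumptions the three chimeric 5' NTRs are 311, 324, and 345 nt long, and each
begins with the BVDV tetranucleotide.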

There were, however, significant differences between the phenotypes of some of
these derivatives versus the reconstructed pseudorevertants. As mentioned above,
5'HCV.R1orig yielded tiny and small plaques and produced low virus yields even after 48 h.
In contrast, the addition of four bases rather than two bases (5'-GUAU vs. 5'-GU) yielded
virus with near wild-type plaque morphology (Fig. 6B) and growth rates (Fig. 7). In the case
of the smaller deletion, 5'HCV.R2orig and 5'HCV.R2cons were indistinguishable, suggesting
that, other than the 5' four bases, the fused heterologous sequences were dispensable. This
was not the case, however, for the chimera containing the 5'-GUAU tetranucleotide sequence
fused to the intact HCV 5' NTR. 5'HCV.R3cons produced small plaques (Fig. 6B) and grew
more slowly than 5'HCV.R3orig (Fig. 7), suggesting that the sequence/structure of the
sequences downstream of the 5' four bases can affect replication efficiency.



The tetranucleotide sequence 5'-GUAU is important for efficient BVDV RNA
accumulation

Next, the effects of the different 5' termini on virus-specific RNA accumulation
directly after transfection were analyzed. This allowed a direct comparison between 5'HCV
and the reconstructed pseudorevertants as well as selected BVDV + HCV deletion constructs.
MDBK cells were transfected with in vitro synthesized RNAs and labeled for 10 h beginning
at 5 h post-transfection with ^-UTP in the presence of actinomycin D (Fig. 8). RNA
replication of the 5' HCV chimera was severely impaired to a level below detection (Fig. 8,
lane 2). In contrast, every 5' NTR alteration of 5' HCV that increased RNA specific
infectivity and allowed efficient virus growth led to readily detectable viral RNA
accumulation. Addition of B1' and B1 to the 5' terminus of the HCV 5' NTR restored RNA
replication to a level ~50% of that observed for BVDV (BVDV + HCVdelB2B3; Fig. 8, lane
3 vs. lane 1). BVDV + HCVdelB2B3H1 displayed reduced RNA synthesis compared to
BVDV + HCVdelB2B3 (Fig. 8, lane 4 vs. lane 3), perhaps explaining its small plaque
phenotype and suggesting a possible positive role for H1 in replication of this chimera.
5'HCV.R1orig, which had exhibited plaque heterogeneity and slow growth, accumulated less
RNA when compared to 5'HCV.R1cons (Fig. 8, lane 5 vs. lane 6). 5'HCV.R2orig and
5'HCV.R2cons showed similar RNA accumulation (Fig. 8, lane 9 vs. lane 10), consistent with
their medium plaque phenotypes; and 5'HCV.R3cons exhibited reduced RNA synthesis
compared to 5'HCV.R3orig (Fig. 8, lane 8 vs. lane 7), consistent with their small- versus
medium-plaque phenotypes.

Although these RNA phenotypes are complex, the most striking result is that addition
of the B1' and B1 hairpins, addition of heterologous 5' sequences terminating with 5'-GUAU, or
simply fusion of this tetranucleotide sequence with the HCV 5' NTR or short 5' truncations of
the HCV 5' NTR all dramatically upregulated RNA accumulation. This occurred without
increasing translation efficiency, at least as measured in a cell-free assay (Fig. 3, compare
lanes 3-8 to lane 1), suggesting that these sequences function at the level of RNA replication
or stability.

Discussion

The work presented here helps to define the requirements for a functional BVDV
5' NTR. The BVDV-specific 5' NTR sequences required for efficient replication in cell
culture are minimal and consist of the 5' terminal sequence, 5'-GUAU. The sequence 5'-
AUAU, detected for some pseudorevertants, may also be functional but this was not tested for
technical reasons. This simple 5'-terminal tetranucleotide sequence, which is conserved
among pestiviruses (Ruggli et al., 1996; Becher et al., 1998), was shown to function in the
context of functional IRES elements derived from the hepacivirus HCV or the picornavirus
EMCV. As discussed below, this may indicate that the 5' signals required for BVDV RNA
replication are rather simple or that elements in these heterologous IRESs can functionally
replace deleted BVDV sequences.

Sequences at the extreme 5' end of BVDV genome RNA could modulate the
efficiency of RNA accumulation by affecting RNA stability, translation, promoter efficiency,
or some combination of these processes. At this time, we cannot distinguish among these
possibilities but favor an effect on RNA replication. The complement of the BVDV 5'
sequence at the 3' end of the negative-strand RNA presumably functions in the initiation of
positive-strand RNA synthesis. Thus, AUAC-3' at the 3' terminus of minus-strand RNA may
be important for positive-strand RNA synthesis. Interestingly, for some positive-strand RNA
viruses such as rubella virus (Pugachev & Frey, 1998), flock house virus (Ball, 1994) and
turnip crinkle virus (Guan et al., 1997), only minimal cis-acting sequences at the 3' termini of
negative-strand RNAs are required for positive-strand RNA synthesis. In contrast to the 5' NTR
replacements, we were unable to generate replication-competent BVDV-HCV chimeras with
the HCV 3' NTR replacing that of BVDV (data not shown). This may indicate that the signals
within the pestivirus 3' NTR required for initiation of negative-strand RNA synthesis are more
complex and virus specific. Once the replication complex has assembled at the 3' NTR and
traversed the RNA during negative-strand synthesis, the requirements of the 5' NTR for
initiation of positive-strand synthesis may be minimal.
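
The relationship between the 5'-GUAU terminus of the genome and the AUAC-3' terminus of
the negative strand is simply that of reverse complementarity. The following minimal sketch
(the function name is illustrative) makes the step explicit:

```python
# Watson-Crick pairing for RNA.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the complementary strand of `rna`, written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


# The positive-strand 5' terminus 5'-GUAU pairs with the last four bases
# (the 3' terminus) of the negative strand.
print(reverse_complement("GUAU"))
```

Written 5' to 3', the complement of GUAU is AUAC, so the negative strand ends in AUAC-3'.
The alternative terminus 5'-AUAU seen in some pseudorevertants would correspond to AUAU-3'.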

Although the RNA replication signals within the 5' NTR appear to be rather simple, it
is possible that the signals important for RNA replication actually extend into the IRES and
are more complicated. For instance, the 5'HCV pseudorevertants were more stable and grew
to higher titers than the 5'EMCV counterparts, despite the fact that the 5'EMCV RNAs were
translated more efficiently in vitro. This may indicate that the BVDV and HCV IRESs
contain signals important for RNA synthesis that are absent in the EMCV IRES.

It is perhaps not surprising that 5' HCV appeared to recombine with cellular mRNAs
to acquire a 5' terminus with the 5'-(G/A)UAU consensus, given that non-cytopathic strains
of BVDV can recombine with BVDV RNA or cellular mRNAs to generate cytopathic strains
of BVDV (Meyers & Thiel, 1996). Presumably, this recombination event involves template
switching during negative-strand RNA synthesis, as observed for poliovirus (Kirkegaard &
Baltimore, 1986). In contrast to 5' HCV, simple deletions of 5' terminal viral sequences could
account for the BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants since the
tetranucleotide sequence is present in these 5' NTRs upstream of functional IRES elements.
Such deletions could occur by partial degradation of positive-strand template prior to
negative-strand synthesis, by premature termination during negative-strand RNA synthesis, or
by degradation of 3' terminal negative-strand sequence after synthesis. It is proposed that
5'HCV was forced to recombine with cellular sequences because HCV does not have a 5'-
(G/A)UAU sequence upstream of its IRES. The first occurrence of a (G/A)UAU
tetranucleotide sequence is at nt 94-97 within hairpin H2, and a 5' deletion extending into this
sequence would presumably inactivate or severely impair HCV IRES activity. It is interesting
that BVDV + HCVdelB1B2B3 and 5'EMCV pseudorevertants were generated at much higher
frequency than 5'HCV pseudorevertants. This may indicate that recombination between
BVDV and cellular RNAs is a rare event compared to the processes which lead to deletion of
terminal viral sequences.
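
Locating the first (G/A)UAU tetranucleotide in a 5' NTR, as discussed above for HCV, is a
simple pattern search. The sketch below uses a regular expression and reports 1-based
positions; the function name and the toy sequence are hypothetical and do not reproduce any
viral sequence.

```python
import re

# (G/A)UAU: a purine (G or A) followed by UAU.
MOTIF = re.compile(r"[GA]UAU")


def first_motif_position(rna):
    """Return the 1-based (start, end) of the first (G/A)UAU, or None if absent."""
    match = MOTIF.search(rna.upper().replace("T", "U"))
    if match is None:
        return None
    return match.start() + 1, match.end()


toy_ntr = "GCCAGCCCCCGAUUGGGGUAUCC"  # made-up sequence for illustration only
print(first_motif_position(toy_ntr))
```

A 5' deletion ending just before the reported start position would leave the motif at the
new 5' terminus; the text notes that for HCV this position (nt 94-97) lies inside hairpin
H2, so such a deletion would be expected to damage the IRES.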

Poliovirus chimeras dependent upon a functional HCV IRES have been reported (Lu
& Wimmer, 1996). Interestingly, viable poliovirus chimeras were produced only when HCV
sequences included both the IRES and the N-terminal portion of the HCV ORF. Nucleotide
sequences or structures in the downstream ORF can modulate HCV IRES translational
efficiency (see Reynolds et al., 1995; Honda et al., 1996a) but it was also suggested that the
N-terminal portion of the HCV core polypeptide might be involved. In the case of our 5'
HCV pseudorevertants, there is no requirement for HCV C protein sequences. Although the
translation efficiency of the HCV IRES in the presence of additional HCV sequences 3' to the
AUG start was not directly assessed, the HCV chimeras and pseudorevertants were
translationally active and infectious in the absence of any portion of the HCV ORF. This
indicates that either the HCV IRES does not extend into the HCV ORF or that the BVDV
ORF contains analogous sequence which functions in our 5'HCV chimeras. There is some
limited identity between HCV and BVDV within this region. For example, HCV nt 359-394
and BVDV nt 405-440 are identical at 21 of 36 positions, although identity within this
sequence may be attributed to a high adenosine content. It is interesting to note that the
luciferase (LUC) and chloramphenicol acetyl transferase (CAT) reporter genes previously
used to detect HCV IRES activity (Tsukiyama-Kohara et al., 1992; Wang et al., 1993) also
have adenosine- or purine-rich regions in relatively the same position as the HCV ORF and
BVDV ORF. If this region is indeed important for IRES activity, this may explain why some
have observed that the HCV IRES does not require a portion of the HCV ORF for translation
of CAT or LUC (Tsukiyama-Kohara et al., 1992; Wang et al., 1993). Point mutations and
insertions within this region of HCV have been shown to reduce HCV IRES activity in vitro
(Honda et al., 1996a,b).
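
For reference, the identity figure quoted above (21 of 36 positions) corresponds to roughly
58%. The sketch below shows one way such a position-by-position comparison, together with
the adenosine content that may account for it, could be computed for two equal-length,
ungapped segments. The function names and the short toy sequences are hypothetical; the
actual HCV and BVDV segments are not reproduced here.

```python
def count_identities(seq_a, seq_b):
    """Return (matches, length) for two equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return matches, len(seq_a)


def adenosine_fraction(seq):
    """Return the fraction of positions in `seq` that are adenosine."""
    return seq.upper().count("A") / len(seq)


# Figure quoted in the text: 21 identical positions out of 36.
print(f"{21 / 36:.1%}")

# Toy example with made-up sequences.
matches, length = count_identities("AAGAAACC", "AAAAAUCC")
print(matches, length, adenosine_fraction("AAGAAACC"))
```

A high adenosine fraction in both segments raises the number of matches expected by chance,
which is the caveat the text attaches to the 21-of-36 figure.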

Despite the fact that B1' and B1 are conserved among different strains of BVDV and
similar hairpins are present in border disease virus and CSFV (Deng & Brock, 1993; Becher
et al., 1998), B1' and B1 were dispensable for BVDV replication, provided that the 5'
tetranucleotide sequence 5'-(G/A)UAU remained. This may indicate a role for B1' and B1 in
viral replication in vivo that we do not observe in cell culture. It will be interesting to test the
phenotype of chimeras that lack B1' and B1 in vivo to determine if they are attenuated and
might serve as useful BVDV vaccines. In this vein, several studies with flaviviruses have
demonstrated that alterations in 5' NTR or 3' NTR elements can lead to attenuation in vivo
(Cahour et al., 1995; Men et al., 1996; Mandl et al., 1998). BVDV chimeras that utilize the
HCV or EMCV IRES may also prove to be attenuated simply due to the presence of the
heterologous IRES. For poliovirus, it has been shown that differences in IRES efficiency in
different host-cell environments can modulate host range and virulence (Shiroki et al., 1997).

BVDV-HCV chimeras that are dependent on a functional HCV IRES may have
another practical application. It may be possible to use these chimeras to screen for anti-HCV
therapeutics that target the HCV IRES. Other researchers have shown antisense
oligonucleotide-mediated inhibition of HCV gene expression in hepatocytes by targeting the
oligonucleotides to the HCV IRES (Hanecak et al., 1996). It will be of interest to measure the
efficacy of antisense oligonucleotides or ribozymes (Lieber et al., 1996) against replicating
virus, and these chimeras are more useful than HCV for this purpose since they are able to
replicate efficiently in cell culture. BVDV is believed to be a reasonable model of HCV
replication not only because of homology and conserved motifs within the 5' NTR but also
because of similarities in overall genetic organization (Rice, 1996) and polyprotein processing
strategy (Tautz et al., 1997; Xu et al., 1997).
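
One way such a screen could be scored is sketched below. It assumes, for illustration only,
that replication (for example, virus yield in PFU/ml) is measured for both the HCV
IRES-dependent chimera and parental BVDV with and without a test compound, and that a
compound is flagged when it reduces the chimera more than the parent. The function names,
the margin parameter, and the example numbers are hypothetical.

```python
def fractional_reduction(untreated, treated):
    """Return the fraction by which the compound reduced replication."""
    if untreated <= 0:
        raise ValueError("untreated replication must be positive")
    return 1.0 - treated / untreated


def flags_hcv_ires_target(chimera_untreated, chimera_treated,
                          bvdv_untreated, bvdv_treated, min_margin=0.0):
    """True if the chimera is reduced more than parental BVDV by over `min_margin`."""
    chimera_drop = fractional_reduction(chimera_untreated, chimera_treated)
    bvdv_drop = fractional_reduction(bvdv_untreated, bvdv_treated)
    return chimera_drop - bvdv_drop > min_margin


# Hypothetical readouts: the chimera drops 100-fold, parental BVDV only slightly.
print(flags_hcv_ires_target(1e6, 1e4, 1e6, 8e5))
# Hypothetical non-selective compound: both viruses drop by the same factor.
print(flags_hcv_ires_target(1e6, 1e4, 1e6, 1e4))
```

Comparing against the parental virus is what separates an effect on the HCV-derived sequence
from general inhibition of BVDV replication or cytotoxicity.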

In view of the above, it will be seen that the several advantages of the invention are
achieved and other advantageous results attained.

As various changes could be made in the above methods and compositions without
departing from the scope of the invention, it is intended that all matter contained in the above
description and shown in the accompanying drawings shall be interpreted as illustrative and
not in a limiting sense.

All references cited in this specification, including patents and patent applications, are
hereby incorporated by reference. The discussion of references herein is intended merely to
summarize the assertions made by their authors and no admission is made that any reference
constitutes prior art. Applicants reserve the right to challenge the accuracy and pertinency of
the cited references.

What is Claimed is:

1. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV), and wherein said chimeric viral RNA is replication-competent.

2. The polynucleotide of claim 1, wherein the chimeric region is the 5' NTR and
the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

3. The polynucleotide of claim 2, wherein the BVDV nucleotide sequence is
located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

4. The polynucleotide of claim 3, wherein the first HCV nucleotide sequence in
the chimeric 5' NTR comprises an internal ribosome entry site (IRES).

5. The polynucleotide of claim 4, wherein the ORF and the 3' NTR consist of
second and third BVDV sequences.

6. The polynucleotide of claim 5, wherein the 5' terminal sequence comprises 5'
GUAU.

7. The polynucleotide of claim 4, wherein the ORF comprises a second HCV
sequence encoding at least one structural protein operably linked to a second BVDV
sequence.

8. The polynucleotide of claim 1, wherein the pestivirus is BVDV and the
chimeric region is the 3' NTR.

9. The polynucleotide of claim 8, wherein the first HCV sequence in the
chimeric 3' NTR comprises the HCV 98 bp 3' terminal element (SEQ ID NO:X) operably
linked to the first BVDV sequence.


10. A method for identifying compounds having antiviral activity against
hepatitis C virus (HCV) comprising the steps of:

(a) providing a first cell containing a chimeric viral RNA which is replication-
competent in the cell, the chimeric viral RNA comprising a 5' nontranslated region (5'
NTR), an open reading frame (ORF) region, and a 3' nontranslated region (3' NTR);
wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C
virus (HCV);

(b) providing a second cell containing the pestivirus; and

(c) comparing the replication efficiency of the chimeric viral RNA in the
presence and absence of a test compound to the replication efficiency of the pestivirus in the
presence and absence of the test compound,

wherein a greater compound-induced reduction in replication efficiency of the chimeric viral
RNA than of the pestivirus indicates the compound has anti-HCV activity.

11. The method of claim 10, wherein the chimeric region is the 5' NTR and the
first pestivirus nucleotide sequence is from a bovine viral diarrhea virus (BVDV).

12. The method of claim 11, wherein the BVDV nucleotide sequence is located
at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU.

13. The method of claim 12, wherein the first HCV nucleotide sequence in the
chimeric 5' NTR comprises an internal ribosome entry site (IRES).

14. The method of claim 13, wherein the ORF and the 3' NTR comprise second
and third sequences from the BVDV.

15. The method of claim 10, wherein the pestivirus is BVDV and the chimeric
region is the 3' NTR.


1 6. A genetically-engineered vims comprising a chimeric RNA genome which 
comprises: 

(a) a 5' nontranslated region (5' NTR); 

(b) an open reading frame (ORF) region; and 
35 (c) a 3 ' nontranslated region (3 ' NTR); 
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wherein at least one of said regions is chimeric and comprises a first nucleotide sequence 
from a pestivirus in operable linkage with a first nucleotide sequence from an hepatitis C 
virus (HCV), and wherein said chimeric RNA genome is replication-competent. 

5 17. The genetically-engineered vims of claim 16, wherein the chimeric region is 

the 5 ' NTR and the first pestivirus nucleotide sequence is from a bovine viral diarrhea virus 
(BVDV). 



18. The genetically-engineered virus of claim 16, wherein the BVDV nucleotide
sequence is located at the 5' terminus of the chimeric 5' NTR and comprises 5' RUAU and
the first HCV nucleotide sequence in the chimeric 5' NTR comprises an internal ribosome
entry site (IRES).

19. A vaccine against bovine viral diarrhea virus (BVDV) comprising an
immunogenically-effective amount of a genetically-engineered virus comprising a chimeric
RNA genome having:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from BVDV in operable linkage with a first nucleotide sequence from an hepatitis C virus
(HCV), and wherein the genetically-engineered virus is attenuated as compared to BVDV.

20. The vaccine of claim 19, wherein the chimeric region is the 5' NTR and the
BVDV nucleotide sequence is located at the 5' terminus of the chimeric 5' NTR and
comprises 5' RUAU and the first HCV nucleotide sequence in the chimeric 5' NTR
comprises an internal ribosome entry site (IRES).



21. A polynucleotide comprising a chimeric viral RNA which comprises:

(a) a 5' nontranslated region (5' NTR);

(b) an open reading frame (ORF) region; and

(c) a 3' nontranslated region (3' NTR);

wherein at least one of said regions is chimeric and comprises a first nucleotide sequence
from a pestivirus in operable linkage with a heterologous nucleotide sequence and wherein
said chimeric viral RNA is replication-competent.
pACNR/BVD NADL-Xba* -> Genes

DNA sequence 15065 b.p. gtatacgagaat ... cgactcactata circular

pACNR/BVD NADL-Xba = HaeII and XhoI digest of pACNR/BVD NADL ligated to
HaeII and XhoI digest of pACNR1180/Drall WBVDS*
8/27 corrected nt 12136 G to C to give HpaI site.

1 gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctagggaacaaatccctc 80

81 tcagcgaaggccgaaaagaggctagccatgcccttagtaggactagcataatgaggggggtagcaacagtggtgagttcg 160 

161 ttggatggcttaagccctgagtacagggtagtcgtcagtggttcgacgccttggaataaaggtctcgagatgccacgtgg 240 

241 acgagggcatgcccaaagcacatcttaacctgagcgggOTtcgcccaggtaaaagcagttttaaccgactgctacgaata 320 

321 cagcctgatagggtgctgcagaggcccactgtattgctactaaaaatctctgctgtacatggcac ATG GAG TTG 394 

1 MEL 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454 

4ITNE L L Y KTYKQK PVGVE E P 23 

455 GTT TAT GAT CAC GCA GOT GAT CCC TTA TTT GOT CAA AGO GGA GCA GTC CAC CCT CAA TOG 514 

24VYDQAG DP LFGERGAVH P QS 43 

515 ACG CTA AAC CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574 

44TLKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GCT GAC TGC AGG TOG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 

64PKRGDC RSGNSRG PVSGI YL 83 

635 AAG CCA GGG CCA CTA TTT TAC CAC GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCC CTC 694 

84KPGPLFY0DYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CCC ATA GGG AGA OTA ACT GGA 754 

104 ELFEEGSMCETTKRIGRVTG 123 

755 AGT GAC GGA AAG CTG TAC CAC ATT TAT GTC TGT ATA CAT GGA TGT ATA ATA ATA AAA AGT 814 

124 SDGKLYHIYVCIDGCI I I KS 143 

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGC GTC CAT AAT AGG CTT GAC TGC CCT CTA 874 

144 ATRSYQRVFRWVHNRLDCPL. 163 

875 TGG CTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAS AAA 934 

164 WVTTC SDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994

184 PDRLERGKKKIVPKESEKDS 203

995 AAA ACT AAA CCT CCC GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054 

204 KTKPPDATIVVECVKYQVRK 223 

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAC GAC GGC TTG TAC CAT AAC AAA AAC AAA CCT 1114 

224 KGKTKSKNTODGLYHNKNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174 

244 QESRKKLEKALLAWAI I A I V 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 

264 LFQVTHGENITQWNLQDNCT 283 

1235 GAA GGG ATA CAA CGG GCA ATG TTC CAA AGG GGT GTG AAT AGA AGT TTA CAT GGA ATC TOG 1294 

284 EGIQRAMFQRGVNRSLHGIW 303 

1295 CCA GAG AAA ATC TGT ACT GCT GTC CCT TCC CAT CTA GCC ACC CAT ATA GAA CTA AAA ACA 1354 

304 PEK I CTGVPSH L A T D I ELKT 323 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414 

324 IHGMMDASEKTNYTCCRLQR 343 

1415 CAT GAG TGG AAC AAG CAT GCT TGG TCC AAC TGC TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 

344 HEWNK HCWCNWYN I EPWX L V 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 

364 MNRTQANLTEGQPPRECAVT 383 

1535 TCT AGG TAT GAT AGG GCT ACT GAC TTA AAC GTG OTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 

384 CRYDR ASDLNVVTQARDSPT 403 

1595 CCC TTA ACA GCT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATG CGG GGC 1654 

404 PLTGCKKGKNFSFAG1 LMRG 423 
1655 CCC TGC AAC TTT GAA ATA GCT GCA ACT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT ACT 1714

424 PCNFEIAASOVLFKEHERI S 443 

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GCT GCC 1774 

444 HFQDTTLYLVDGLTNSLEGA 463 

1775 AGA CAA GGA ACC GCT AAA CTG ACA ACC TGG TTA GGC AAG CAG CTC GGG ATA CTA GGA AAA 1834 

464 RQCTAKLTTWLCKQLGILGK 483 

1835 AAG TTG GAA AAC AAG AGT AAG ACG TGG TTT GGA GCA TAC GCT GCT TCC CCT TAC TGT CAT 1894 

484 KLENKSKTWFGAYAASPYC 0 503 

1895 GTC GAT CGC AAA ATT GCC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TCC TTA CCC 1954

504 VDRKIGYI WYTKNCTPAC L P 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGXI 543 

2015 TTA CAT GAG ATG GGG GCT CAC TTG TCC GAG GTA CTA CTA CTT TCT TTA GTG CTG CTC TCC 2074 

544 LHEMGGHLSEVLLLSLVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLI LHFSI P Q 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVE LT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TOG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEV1 PGSVWNLGKYVC I R P 623 

2255 AAT TOG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWPYETTVVLAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA CAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIVfNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQHVQGI LW 683

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TOG TAT GCC 2494 

684 LLLITGVQGHLDCKPEFSYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT GOT CAA CTG GOG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKDERIGQLGAEGLTTTWK 723 

2555 CAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC CAA GAT GGG 2614 

724 EYS PGMKLEDTMVIAWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAI LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CCA AAG CAA GAG GAT 2734 

764 RALPTSVVPKKLFDGRKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TCC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAK P I 803 






2795 GTA AGA GGG AAG TTC AAT ACA ACQ CTG CTG AAC GGA CCG GCC TTC CAG ATG GTA TGC CCC 2854 

804 VRGKFNTTLLNGPAFQMVC P 823 

2855 ATA GGA TGG ACA GOG ACT GTA AGC TOT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDTLATT 843 

2915 GTG GTA COG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGO CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPPPHR0GCITQ 863 

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEOLHNCI LGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGSX ESCKWCGYQ 903 

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYP IGKCKLENE 923 

3155 ACT OCT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA OCT GTG GCC ATA GTA CCA 3214 

924 TGY RLVDSTSCNREGVAI V P 943 

3215 CAA GOG ACA TTA AAG TCC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 OCT LKCK IGKTTVQVI AMOT 963 

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEI ISSEGPVE 983

3335 AAG ACA GCG TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFN YTKT LKNKY FE PR 1003 
3395 GAC ACC TAC TTT CAG CAA TAC ATG CTA AAA CGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454

1004 DSYFQQYMLKCEYQYWPDLE 1023 

3455 CTG ACT GAC CAT CAC CGG GAT TAC TTC GCT GAG TCC ATA TTA GTC GTG GTA CTA GCC CTC 3514 

1024 VTDHHRDYFAESILVVVVAi, 1043 

3515 TTC GGT GGC AGA TAT GTA CTT TGG TTA CTC GTT ACA TAC ATG GTC TTA TCA CAA CAG AAG 3574 

1044 LGGRYVLWLLVTYMVLSEQK 1063 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GGC GAA GTG GTG ATG ATG GGC AAC TTC CTA ACC CAT 3634 

1064 ALCI OYGSGEVVHMGNLLTH 1083 

3635 AAC AAT ATT GAA GTG GTC ACA TAC TTC TTC CTG CTC TAC CTA CTG CTC AGG GAG GAG AGC 3694 

1084 NNIEVVTYFLLLYLLLREES 1103 

3695 GTA AAG AAG TGG CTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VKKWVLLLYHILVVHPIKS V 1123 

3755 ATT GTG ATC CTA CTC ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GGC CAA GAG TAC 3814 

1124 I VI L L M I GDVVKADSGGQEY 1143 

3815 TTC GGG AAA ATA GAC CTC TCT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874 

1144 LGKI DLCFTTVVLIVIGLI I 1163 

3875 GCC AGG CGT GAC CCA ACT ATA GTC CCA CTG GTA ACA ATA ATG GCA GCA CTG AGG GTC ACT 3934 

1164 ARRDPT1VPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC GCT GTG GOG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHOPGVDIAVAVMTITL L 1203 

3995 *GC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 4054

1204 MVSYVTDYFRYKKWLOCI LS 1223 

4055 CTG GTA TCT GCG GTG TTC TTC ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLIRSLIYLGRIEMP 1243 

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTC ATC TCA ACA 4174 

1244 EVTI PNWRPLTLI LLYLI ST 1263 

4175 ACA ATT GTA ACC AGG TGG AAG GTT GAC GTG GCT GGC CTA TTC TTG CAA TCT GTG OCT ATC 4234 

1264 T IVTRWKVDVAGLLLQCVPI 1283 

4235 TTA TTG CTC GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLILILPT 1303

4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA AGT TGG 4354

1304 YELVKLYYLKTVRTDIERSW 1323

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414 

1324 LGGIDYTRVDSIYDVDESGE 1343 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGC AAT TTT TCT ATA CTC TTG CCC 4474 

1344 GVYLFPSROKAQGNFSILLP 1363 

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TOG CAG CTA ATA TAC ATG AGT 4534 

1364 LIKATLISCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLDFMYYMMRKVIEEISG 1403 

4595 GGT ACC AAC ATA ATA TCC AGC TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTN1 1SRLVAALIELNWSME 1423 

4655 GAA GAG GAG ACC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAC ACC GTG GCT TCT TGG TAC GCG GAG GAG GAA GTC 4774 

1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC GGT ATC CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 4834 

1464 YGNPKIMTI I KASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TCT GAG GGC CGA GAG TGG AAA GGT GGC ACC TGC CCA AAA TCT 4894 

1484 CI I CTVCEG R EWKGGTC PKC 1503 

4895 GGA CCC CAT GGG AAG CCG ATA ACG TCT GGG ATC TCC CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHCKP1TCCMSLADFEERH 1523 

4955 TAT AAA ACA ATC TTT ATA AGG GAA GGC AAC TTT GAG GGT ATG TGC AGC CGA TGC CAC GGA 5014 

1524 YKRIFIREGNFEGMCSRCOG 1543 

5015 AAG CAT AGG AGG TTT GAA ATG GAC CGG GAA CCT AAG AGT GCC AGA TAC TGT GCT GAG TGT 5074 

1544 KHRRFEMDRE PKSARYCAEC 1563 

5075 AAT AGG CTG CAT CCT GCT GAG GAA OCT GAC TTT TGC GCA GAG TOG AGC ATG TTG GGC CTC 5134 

1564 NRLHPAEECDFWAESSMLGL 1583 



5135 AAA ATC ACC TAC TTT GCG CTG ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194

1584 KITYFALMDGKVYDITEWAG 1603

5195 TGC CAG CGT GTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254 

1604 CQRVCTSPDTHRVPCHISFG 1623 

5255 TCA CGG ATG CCT TTC AGG CAG GAA TAC AAT GCC TTT CTA CAA TAT ACC GCT AGG GGG CAA 5314 

1624 SRMPFRQEYNCFVQYTARGQ 1643 

5315 CTA TTT CTG AGA AAC TTC CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GCC AAC 5374 

1644 LFLRNLPVLATKVKMLMVGN 1663 

5375 CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 5434 

1664 LGEE I GNLEHLGWI LRGPAV 1683 

5435 TCT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 5494 

1684 CKKITEHEKCHINILDKLTA 1703 

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC COG GTG AGG TTC CCT ACG AGC 5554 

1704 FFGIMPRGTTPRAPVRFPTS 1723 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GCC GGG ATA 5614 

1724 LLKVRRGLETAWAYTHQGG I 1743 

5615 AGT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CCA 5674 

1744 SSVDHVTAGKDLLVCDSMGR 1763 

5675 ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG TTC ACC GAT GAG ACA GAG TAT GGC GTC AAG 5734 

1764 TRVVCQSNNRLTDETEYGVK 1783 

5735 ACT GAC TCA GOG TGC CCA GAC GGT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 5794 

1784 TDSGC PDGARCYVLNPEAVN 1803 

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAG ACA GGT GGA GAA TTC ACG TGT 5854 

1804 ISGSKGAVVHLQKTGGEFTC 1823 

5855 GTC ACC GCA TCA GGC ACA COG GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 5914

1824 VTASGTPAPFDLKNLKGWSG 1843

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 5974

1844 LPT FEASSGRVVGRVKVGKN 1863

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG ACT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 6034

1864 EESKPTKIMSGIQTVSKNRA 1883

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 6094 

1884 DLTEMVKKI TSMNRGDFKQ I 1903 

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 6154 

1904 TLATGAGKTTELPKAVI EEI 1923 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGC GCA GOG GCA GAG TCA GTC TAC 6214 

1924 GRHKRVLVL1PLRAAAESVY 1943 

6215 CAG TAT ATG AGA TTC AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 6274 

1944 QYMRLKHPS I SFNLRIGDMK 1963 

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 6334 

1964 ECDMATCITYASYGYFCQMP 1983 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 6394 

1984 Q/PK LRAAMVEYSYI PLDEYH 2003 

6395 TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG ACT ATA 6454 

2004 CAT PEQLAI IGKIHRFSES I 2023 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TOG GTG ACC ACA ACA GGT CAA AAG CAC 6514 

2024 RVVAMTATPAGSVTTTGQKH 2043 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT ACT CAG TTC 6574 

2044 p X EEF IAPEVMKGEDLGSQF 2063 

6575 CTT GAT ATA GCA GGC TTA AAA ATA CCA GTG GAT CAG ATG AAA GGC AAT ATG TTG GTT TTT 6634 

2064 L D I AGLKIPVDEMKGNMLVF 2083 

6635 GTA CCA ACG AGA AAC ATC GCA GTA GAC GTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 6694 

2084 VPTRNMAVEVAKKLKAKGYN 2103 

6695 TCT GGA TAC TAT TAC ACT GGA GAG GAT CCA GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC 6754 

2104 SGYYY SGEDPANLRVVTSQS 2123 

6755 CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT GAA TCA GGA GTC ACA CTA CCA GAT TTG GAC 6814 

2124 PYVIVATNAI ESGVTLPDLD 2143 

6815 ACG GTT ATA GAC ACG GGG TTG AAA TGT GAA AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC 6874 

2144 TV I DTGLKCEKRVRVSSK I P 2163 
6875 TTC ATC GTA ACA GGC CTT AAG AGG ATC GCC GTG ACT GTC GGT GAG CAG GOG CAG CCT ACG 6934

2164 FIVTGLKRMAVTVCEQAQRR 2183 

6935 GGC ACA GTA GGT AGA GTG AAA CCC GOG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG 6994 

2184 GRVGRVKPCRYYRSQETATG 2203 

6995 TCA AAG GAC TAC CAC TAT CAC CTC TTG CAG GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC 7054 

2204 SKDYHYDLLQAQRYGI EDGI 2223 

7055 AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT TAC GAT TOG AGC CTA TAC GAG GAG GAC AGC 7114 

2224 NVTKSFREMNYDWSLY EEDS 2243 

7115 CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC 7174 

2244 LLITQLEI LNNLLI SEDLPA 2263 

7175 GCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC 7234 

2264 AVKNIMARTDHPEPIQLAYN 2283 

7235 AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC CCA AAA ATA ACG AAT GGA GAA GTC ACA GAC 7294 

2284 SYEVQVPVLFPKIRNGEVTD 2303 

7295 ACC TAC GAA AAT TAC TCG TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT 7354 

2304 TYENYSFLNARKLGEDVPVY 2323 

7355 ATC TAC GCT ACT GAA GAT GAG GAT CTC GCA GTT GAC CTC TTA GGG CTA GAC TGG CCT GAT 7414 

2324 IYATEDEDLAVDLLG LDWPD 2343 

7415 CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC 7474 

2344 PGNQOVVETGKALKQVTGLS 2363 

7475 TCG GCT GAA AAT GCC CTA CTA GTC GCT TTA TTT GGG TAT GTG GGT TAC CAG CCT CTC TCA 7534 

2364 SAENALLVALFGYVGYQALS 2383 

7535 AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC 7594 

2384 KRHVPHITDI YTI EDQRLEO 2403 

7595 ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG 7654 

2404 TTHLQYA PNA I KTDGT ETEL 2423 

7655 AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT 7714

2424 KELASGDVEK IMGAI SDYAA 2443

7715 GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA 7774

2444 GGLEFVKSQAEKIKTAPLFK 2463

7775 GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT 7834

2464 ENAEAAKGYVQKF I DSLIEN 2483

7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA 7894

2484 KEEI I RYGLWGTHTALYKSI 2503 

7895 GCT GCA ACA CTG GGG CAT CAA ACA GCC TTT GCC ACA CTA GTG TTA AAC TGG CTA GCT TTT 7954 

2504 AARLGHETAFATLVLKWLAF 2523 

7955 GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GCG GCA GTT GAT TTA GTG CTC TAT TAT 8014

2524 GGESV5DHVKQAAVDLVVYY 2543 

8015 GTG ATG AAT AAG OCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG CCA TTC 8074 

2544 VMNKPSFPGDSETOQBGRRF 2563 

8075 GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 8134 

2564 VASLFISALATYTYKTWNYH 2583 

8135 AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA 8194 

2584 NLSKVVE PAIAYLPYATSAL 2603 

8195 AAA ATG TTC ACC CCA ACG COG CTG GAG AGC GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA 8254 

2604 KMFTPTR LESVVI LSTTIYK 2623 

8255 ACA TAC CTC TCT ATA AGG AAG GGG AAG ACT GAT GCA TTG CTG GGT ACG GGG ATA AGT GCA 8314 

2624 TYLS1 RKGKS DGLLGTGI SA 2643 

8315 GCC ATC GAA ATC CTG TCA CAA AAC CCA GTA TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA 8374 

2644 AMEILSQNPVSVGI S V M L G V 2663 

8375 GOG GCA ATC GCT GCG CAC AAC GCT ATT GAG TCC AGT GAA CAG AAA AGG ACC CTA CTT ATG 8434 

2664 GAIAAHNAIESSEQKRTLLH 2683 

8435 AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC 8494 

2684 KVFVKNFLDQAATDELVKEN 2703 

8495 CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA 8554 

2704 PEKIIMALFEAVQTIGNPLR 2723 

8555 CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC AAA GGT TOG GAG GCC AAG GAA CTA TCT GAG 8614 

2724 L1YHLYGVYYKGWEAKELSE 2743 
8615 AGG ACA GCA GGC AGA AAC TTA TTC ACA TTC ATA ATC TTT GAA CCC TTC GAG TTA TTA GGG 8674

2744 RTAGRNLFTLI MFEAFELLC 2763 

8675 ATC GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAG ATT TTG GAT TTG ATA TAC 8734

2764 MDSOGKI RNLSGNY I LDLI Y 2783 

8735 GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTG GGC TGG GCC CCT CCA 8794

2784 CLHKO INRGLKKMVLGWAPA 2803 

8795 CCC TTT ACT TCT GAC TGG ACC CCT ACT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854 

2804 PFSCDWTPSDERI RLPTDNY 2823 

8855 TTG AGG GTA CAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT GTA OCT 8914 

2824 LRVETRCPCGYEMKAFKNVG 2843 

8915 GGC AAA CTT ACC AAA GTG GAG GAG ACC GGG CCT TTC CTA TGT AGA AAC AGA CCT OCT AGG 8974

2844 GKLTKVEESGPFLCRNRPGR 2863 

8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034 

2864 GPVNY RVTKYY DDNLREI KP 2883 

9035 CTA CCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094 

2884 VAKLEGQVEHY YKGVTAK I D 2903 

9095 TAC ACT AAA GGA AAA ATG CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GCT GTC ATA 9154

2904 YSKGKMLLATDKWEVEHGVI 2923 

9155 ACC AGG TTA CCT AAG AGA TAT ACT GGC GTC GOG TTC AAT GGT GCA TAC TTA GCT CAC GAG 9214 

2924 TRLAK RYTGVG FNGAYLGDE 2943 

9215 CCC AAT CAC CCT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274 

2944 PNHR A LVERDCAT ITKNTVQ 2963 

9275 TTT CTA AAA ATG AAG AAG GGG TGT GOG TTC ACC TAT GAC CTG ACC ATC TCC AAT CTG ACC 9334 

2964 FLKMKKGCAFTYDLTISMLT 2983 

9335 AGG CTC ATC GAA CTA GTA CAC ACC AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACQ 9394

2984 R L I EUVHRNNLEEKEIPTAT 3003

9395 CTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454

3004 VTTW LAYT FVN EDVGTI K PV 3023

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA GTT GAT ATC AAT TTA CAA CCA CAG GTG CAA 9514

3024 LGERVIPDPVVDINLQPEVQ 3043

9515 GTC GAC ACC TCA CAG GTT GGG ATC ACA ATA ATT GGA AGG GAA ACC CTG ATC ACA ACG GGA 9574

3044 VDTSEVGITI I GRETLMTTG 3063

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCC GTG AAG 9634 

3064 VTPVLEKVEPDASDMQNSVK 3083 

9635 ATC GGG TTG GAT GAG GGT AAT TAC CCA GGG CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694 

3084 IGLDEGNY PGPGI QTHTLTE 3103 

9695 GAA ATA CAC AAC AGG GAT GCG AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754 

3104 EIHNRDARPFI HI LGSRNSI 3123 

9755 TCA AAT AGC GCA AAG ACT GCT AGA AAT ATA AAT CTC TAC ACA GGA AAT GAC CCC AGC GAA 9814 

3124 SNRAKTARNINLYTGNDPRE 3143 

9815 ATA CCA GAC TTG ATG GCT GCA GGC CCC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874 

3144 IRDLMAAGRMLVVALRDVDP 3163 

9875 GAC CTC TCT GAA ATG CTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAG GCC CTG GAG GCT 9934 

3164 ELSEMVDFKGTFLDREALtEA 3183 

9935 CTA ACT CTC GCG CAA CCT AAA CCC AAG CAG GTT ACC AAC GAA CCT GTT AGG AAT TTC ATA 9994 

3184 LS LGQPK PKQVTKEAVR N L I 3203 

9995 GAA CAG AAA AAA GAT CTG GAC ATC CCT AAC TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG 10054 

3204 EQKKDVEI PNWFASDDPVFL 3223 

10055 GAA GTC GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT 10114 

3224 EVALKNDKYYLVGDVCELKO 3243

10115 CAA CCT AAA GCA CTT GGG CCC ACG CAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174 

3244 QAKALGATDQTRI IKEVGSR 3263 

10175 ACG TAT GCC ATG AAC CTA TCT AGC TGC TTC CTC AAG GCA TCA AAC AAA CAG ATG ACT TTA 10234 

3264 TYAMKLSSWFLKASNKQMSL 3283 

10235 ACT CCA CTG TTT GAG GAA TTC TTG CTA CCC TCC CCA CCT GCA ACT AAG AGC AAT AAG GGG 10294 

3284 TPLFEELLLRC PPATKSNKG 3303 

10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAC CCC CTC GGT TCC GGG GTG 10354 

3304 HMASAYQLAQGNWEPLGCCV 3323 
10355 CAC CTA GGT ACA ATA CCA CCC AGA AOG GTG AAG ATA CAC CCA TAT GAA GCT TAC CTC AAG 10414
3324 HLGTIPARRVKIHPYEAYLK 3343

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA 10474
3344 LKDFIEEEEKKPRVKDTVIR 3363

10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 10534
3364 EHNKWILKKIRFQGNLNTKK 3383

10535 ATG CTC AAC CCG GGG AAA CTA TCP GAA CAG TTC CAC AGG GAG GGG CGC AAG AGG AAC ATC 10594
3384 MLNPGKLSEQLDREGRKRNI 3403

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA ACT GGA GGC ATA AGG CTC GAG AAA TTG CCA 10654
3404 YNHQIGTIMSSAGIROEKLP 3423

10655 ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC TTT CAT GAG GCA ATA AGA GAT AAG ATA GAC 10714
3424 IVRAQTDTKTPHEAIRDKID 3443

10715 AAG AGT GAA AAC CGG CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACC 10774
3444 KSENRONPELHMKLLEIFHT 3463

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG GTG ACC TGG GAC CAA CTT GAG GCG 10834
3464 IAQPTLKHTYGEVTWEQLEA 3483

10835 GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTC 10894
3484 GINRKGAAGFLEKKNIGEVL 3503

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAC ATA 10954
3504 DSEKHLVEQLVRDLKAGRKI 3523

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC AGT GAT GAC TGG CAG 11014
3524 KYYETAIPKNEKRDVSDDW0 3543

11015 GCA GGG GAC CTG GTC GTT GAC AAG AGO CCA AGA GTT ATC CAA TAC CCT GAA GCC AAC ACA 11074
3544 AGDLVVEKRPRVIQYPEAKT 3563

11075 AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA 11134
3564 RLAITKVMYNWVKQQPVVIP 3583

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TCG GAC 11194
3584 GYEGKTPLFNIPDKVRKEWD 3603

11195 TCG TTC AAT GAG CCA GTG GCC GTA AGT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTC ACT 11254
3604 SFNEPVAVSPDTKAWDTOVT 3623

11255 AGT AAG GAT CTC CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAC AAC CAG TGG CAC 11314
3624 SKDLQLIGEIQKYYYKKEWH 3643

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT 11374
3644 KFIDTITDHMTEVPVITADG 3663

11375 GAA GTA TAT ATA AGA AAT CGG CAG AGA GGG AGC GGC CAG CCA GAC ACA AGT GCT GGC AAC 11434
3664 EVYIRNGQRGSGQPDTSAGN 3683

11435 AGC ATG TTA AAT GTC CTG ACA ATC ATG TAC GGC TTC TGC GAA AGC ACA CGG GTA CCG TAC 11494
3684 SMLNVLTMMYGFCESTGVPY 3703

11495 AAG AGT TTC AAC AGG GTC GCA AGG ATC CAC GTC TOT GGG GAT GAT GGC TTC TTA ATA ACT 11554
3704 KSFNRVAR1HVCGDDGFLIT 3723

11555 GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC 11614
3724 EKGLGLKFANKGMQILHEAG 3743

11615 AAA CCT CAG AAC ATA ACC GAA GGG GAA AAC ATG AAA GTT GCC TAT ACA TTT GAC GAT ATA 11674
3744 KPQKITEGEKMKVAYRFEDI 3763

11675 GAG TTC TGT TCT CAT ACC CCA CTC CCT GTT AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG 11734
3764 EFCSHTPVPVRWSDNTSSHM 3783

11735 GCC GGG AGA GAC ACC GCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA AGT GGA 11794
3784 AGRDTAVILSKMATRLDSSG 3803

11795 GAG AGG GGT ACC ACA GCA TAT GAA AAA GOG GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC 11854
3804 ERGTTAYEKAVAFSFLLMYS 3823

11855 TGG AAC CCG CTT GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAG CCA GAG ACA GAC 11914
3824 WNPLVRRICLLVL5QQPETD 3843

11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA 11974
3844 PSKHATYYYKGDPIGAYKDV 3863

11975 ATA GGT CGG AAT CTA AGT GAA CTG AAG AGA ACA GGC TTT GAC AAA TTC CCA AAT CTA AAC 12034
3864 IGRNLSELKRTGFEKLANLN 3883

12035 CTA AGC CTG TCC ACC TTG GGG ATC TOG ACT AAC CAC ACA AGC AAA AGA ATA ATT CAG GAC 12094
3884 LSLSTLGIWTKHTSKRII0D 3903
12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154
3904 CVAIGKEEGNWLVNADRLIS 3923 

12155 AGC AAA ACT GGC CAC TTA TAG ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 
3924 SKTGHLYIPDKGFTLQGKHY 3943 

12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGG ACT GAG AGA TAG 12274 
3944 EO LOLRTETNPVHGVGTERY 3963 

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACG GCC 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983 

12335 GTC GGC GTC AGC AGC TGA gacaaaatgtatatactgtaaataaattaatccatgtacatagtgtatataaatat 12408 
3984 V C V S S 3989 

12409 sgccgggaccgcccaccccaagaagacgacacgcccaacacgcacagccaaacagtagccaagaccacccaccccaagat 12488 

12489 aacaccacacctaacgcacacagcaccctagccgtatgaggatacgcccgacgtccacagttggaccagggaagacctct 12568 

12569 aacagccccccgcaggtcaatcaaccagtgggaacacgcggggcacgccgcgctccagcacattgacgacccaaccctca 12648

12649 cgtctgacagcctatcaccgtcgagcaagacgtcccccgttgaacatggcccataacaccccccgcaccaccgttcatgc 12728 

12729 aagcagacagccctaccgttcatgatgatatattcttatcttgtgcaatgtaacaccagagattctgagacacgcggcct 12808 

12809 cgccgaacaaatc^aactttcgctgagttgaaggaccagatcacgcatcttcccgacaacgcagaccgtcccgtggcaaa 12888 

12889 gcaaaagttcaaaatcaccaactggtccacctacaacaaagctctcatcaaccgtggctccctcactttctggctggatg 12968 

13049 cctactatgccggcaccgatgagggtgtcagtgaagtgctccatgtggcaggaga a aa aag gctgcaccggtgcgtcagc 13128 
13129 agaatatgtgatacaggatatatcccgcttcctcgctcactgactc^tacg« 13208 
13209 acggcccacgaacggggcggagatttcctggaagacgccaggaagatacttaacagggaagcgagagggccgcggca^ 13288 
13289 ccgt 1 1 1 tccacaggc tccgcccccc tgacaagcatcacgaaa tc tgacgc tcaaatcagtggtggcgaaacccgacagg 13368 
13369 actacaaagataccaggcgtttcccccggcggccccctcgcgcgctctcctgttcctgcctttcggtttaccggtgtcat 13448 
13449 tccgctgttatggccgcgtttgtctcattcxacgcctgacacM 13528 

13609 aaaagcaccaccggcagcagccactggtaaccgatttagaggagttagtcttgaagtcatgcgccggttaaggccaaact 13688 
13689 gaaaggacaagttttggcgactgc^cteccccaagccagtcacctcggttcaaagagttggcagcccagagaaccctcga 13768 

13849 teat taaggggt c tgacgc tcagcggaacgaaaac t cacgt taagggat t t egg c ca t gaga t t a tcaaaaagga tc t cc 13928 
13929 acc caga tccttttaaac taaaaacgaag t c t caaaccaac c t aaag ta ta ta tgagtaaac c egg cc cgacagc caeca 1 4 0 0 B 
14009 acgcttaatcagcgaggcacctatctcagt^atctgt^ 14088 
14089 t aac tacga c aegggaggge t t acca tctggccccagtgct gcaa tga taccgcgagacccacgc tcaccggc tccaga t 14168 
14169 tcatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaacttcatccgcctccatccagtctat 14248 
14249 caatcgttgccgggaagccagagtaagtagttcgccagttaatagtttgcgcaacgtcgttgccactgctgcaggcatcg 14328 
14329 tggtgtcacgctcgtcgtttggtatggcttcactcagctccggctra 14408 
14409 ccgtgcaaaaaagcggctagccccttcggtcctccgatcgttgtcagaagtaagctggccgcagtgttatcactcatggt 14488 
14489 tacggcagcactgcataactctcttactgtcatgccatccgtaagatgcttctctgtgaccggtgagtactcaaccaagt 14568
14569 cactccgagaacagcgtatgcggcgaccgagctgctcctgcccggcgtcaacacgggacaataccgcgccacacagcaga 14648
14649 actccaaaagcgcccatcattggaaaacgctcttcggggcgaaaaccctcaaggatcttaccgccgctgagatccagttc 14728 
14729 gatgcaacccacccgtgcacccaaccgaccctcagcatctttcactctcaccagcgtctccgggcgagcaaaaacaggaa 14808 
14809 ggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcataccctccctttttcaacattattga 14888 
14889 agcatttatcagggttatcgtctcatgagcggatacacatctgaatgtactcagaaaaacaaacaaataggggctccgcg 14968 
14969 cacacctccccgaaaagtgccacccgacgccgacctgaggtaattacaacccgggccccacatatggatccaaccctaga 15048 
15049 taacacgacccaccaca 15065 
BVDV NADL (inf. clone) -> Genes

DNA sequence 12578 b.p. gtatacgagaat ... ctaacagccccc linear

1 gcacacgagaactagaaaaggcacccgcatacgtattgggcaactaaaaacaacaaccaggcccagggMcaaaccccCc 80 

81 tcagcgaaggccgaaaagaggccagccacgcccccagcaggactogcacaacgaggggggtagcoacagcggtgagcccg 160 

161 tcggatggcccaagccccgagtacagggcagtcgccagcggcccgacgccttggaacaaaggccccgagatgccacgtgg 240 

241 acgagggcatgcccaaagcacatcctaacccgagcgggggccgcccaggcaaaagcagttctaaccgaccgtcacgaaca 320

321 cagcccgatagggcgccgcagaggcccactgcaccgccactaaaaatctctgctgtacatggcac ATG GAG TTG 394 

1 MEL 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAG AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454 

4ITNELLYKTYKQKPVGVEEP 23 

455 CTT TAT GAT CAG GCA GCT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514 

24VYDQAGDPLFGERGAVHPQS 43 

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574 

44TLKLPHKRGERDVPTNLASL 63 

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA OCT GTG AGC GGG ATC TAC CTG 634 

64PKRGDCR SGNSRGPVSG 1 Y L 83 

635 AAG CCA GGG CCA CTA TTT TAC CAC GAC TAT AAA GGT CCC GTC TAT CAC AGG GCC CCG CTG 694 

84 KPGPLFYQDYKCPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATC TGT GAA ACG ACT AAA COG ATA GGG AGA GTA ACT GGA 754 

104 E LF EEGSMCETTKR I GRVTG 123 

755 ACT GAC GCA AAG CTG TAC CAC ATT TAT GTG TGT ATA GAT GGA TGT ATA ATA ATA AAA AGT 814 

124 SDGKLYH IYVCIDGCII IKS 143 

815 GCC ACG AGA AGT TAC CAA AGG GTG TTC AGG TGG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874 

144 ATRSYQRVFRWVHNRLDCPL 163 

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934

164WVTTCSDTKEEGATKKKTQK 183 

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTG CCC AAA GAA TCT GAA AAA GAC AGC 994

184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054 

204 KTKPPDATIVVEGVKYQVRK 223

1055 AAG GGA AAA ACC AAG AGT AAA AAC ACT CAG GAC GCC TTG TAC CAT AAC AAA AAC AAA CCT 1114

224 KGKTKSKNTQDGLYHNXNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCC TGG GCA ATA ATA GCT ATA GTT 1174 

244 QES RKKLEKALLAWAI I A I V 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TCG AAC CTA CAA GAT AAT GGG ACG 1234 

264 LFQVTMGENXTQWNLQDNGT 283 

1235 GAA GOG ATA CAA CGC GCA ATG TTC CAA AGO GOT GTG AAT AGA AGT TTA CAT GGA ATC TGG 1294 

284 EGI ORAMPQRGVNRSLHGIW 303 

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 

304 PEK ICTGVPSHLATDIELRT 323 

1355 ATT CAT GGT ATG ATG GAT GCA AGT GAG AAG ACC AAC TAC ACG TGT TGC AGA CTT CAA CGC 1414 

324 IHGMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGC TGC AAC TOG TAC AAT ATT GAA CCC TGC ATT CTA GTC 1474 

344 HEWNKHGWCNWYNIEPWILV 363 

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA GTC ACT 1534 

364 MNR TQANLTEGQPPRECAVT 383 

1535 TGT AGG TAT GAT AGG GCT AGT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 

384 CRYDRASDLNVVTQARDSPT 403 

1595 CCC TTA ACA GCT TGC AAG AAA GGA AAG AAC TTC TCC TTT GCA GGC ATA TTG ATC CGG CGC 1654 

404 PLTGCKKGKNFSPAGI LMRG 423 

1655 CCC TGC AAC TTT GAA ATA GCT GCA AGT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT AGT 1714 

424 PCNFEIAASDVLFKEHERIS 443 

1715 ATG TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTG ACC AAC TCC TTA GAA GGT GCC 1774 

444 MFQDTTLY LVDGLTNSLEGA 463 
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1775 ACA CAA GGA ACC OCT AAA CTG ACA ACC TOG TTA GCC AAG CAG CTC GGG ATA CTA OCA AAA 1834 

464 RQGTAKLTTWLGKQLGILCK 483

1835 AAG TTG GAA AAC AAG ACT AAG ACG TOG TTT GGA GCA TAC GCT OCT TCC CCT TAC TGT GAT 1894 

484 KLENKSKTWFGAYAASPYCD 503 

1895 GTC GAT CGC AAA ATT CGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TCC TTA CCC 1954 

504 VDRKICYIWYTKNCTPACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT GAC ACC AAT GCA GAG GAC GCC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGKI 543 

2015 TTA CAT GAG ATG GGG OCT CAC TTG TOG GAG GTA CTA CTA CTT TCT TTA GTG GTG CTG TCC 2074 

544 LHEHGCHLSEVLLLSLVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT ACT GTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLILHFSIPQ 583 

2135 AGT CAC GTT GAT GTA ATG GAT TGT GAT AAG ACC CAG TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQLNLTVELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TCT ATA AGA CCA 2254 

604 TAEVI PGSVWNLGKYVC I RP 623 

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG ACC CAG GTG GTG 2314 

624 NWWPYETTVVLAFEEVSQVV 643 

2315 AAG TTA GTG TTG AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLTRIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGG 2434 

664 AFLVCLVKIVRGQMVQG I LW 683 

2435 CTA CTA TTG ATA ACA GOG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCG TAT GCC 2494 

684 LLL ITGVQGHLDCK PEF SYA 703 

2495 ATA GCA AAG GAC GAA AGA ATT OCT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 IAKOERIGQLGAEGLTTTWK 723 



2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTHVI AWCEDG 743 

2615 AAG TTA ATG TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMYLQRCTRETRYLAI LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG CGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQED 783 

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794 

784 VVEMNDNFEFGLCPCDAKPI 803 

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCC GCC TTC CAG ATG GTA TGC CCC 2854

804 VRGKFNTTLLNG PAF QMVCP 823 

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 ICWTGTVSCTSFNMDTLATT 843

2915 GTG GTA OGG ACA TAT ACA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRRSKPFPHRQGCITQ 863

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCI LGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGSIESCKWCGYQ 903

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPIGKCKLENE 923 

3155 ACT GOT TAC AGG CTA GTA GAC AGT ACC TCT TGC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 

924 TGY RLVDSTSCNR EGVA IVP 943 

3215 CAA GGG ACA TTA AAG TGC AAC ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATC GAT ACC 3274 

944 QCTLKCKICKTTVQVIAMDT 963

3275 AAA CTC GGA CCT ATG CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLGPMPCRPYEIISSEGPVE 983

3335 AAG ACA GCC TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAC TAT TTT GAG CCC AGA 3394 
984 KTACTFNYTKTLKNKY FEPR 1003 

3395 GAC AGC TAC TTT CAG CAA TAC ATG CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 
1004 DSY FQQYMLKGEYQYWF DLE 1023 

3455 GTG ACT GAC CAT CAC CCG GAT TAC TTC GCT GAG TCC ATA TTA GTC GTG GTA GTA GCC CTC 3514 
1024 VTDHHRDYFAESILVVVVAL 1043
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3515 TTC GGT CGC AGA TAT GTA CTT TOG TTA CTG GTT ACA TAC ATG GTC TTA TCA GAA CAG AAC 3574 

1044 LGCRY VLWLLVTYMV LSEQK 1063 

3575 GCC TTA GGG ATT CAG TAT GGA TCA GOG GAA CTG GTG ATG ATG GGC AAC TTC CTA ACC CAT 3634 

1064 ALGIGYGSGEVVMMGNLLTH 1083 

3635 AAC AAT ATT GAA GTG GTG ACA TAC TTC TTC CTG CTG TAC CTA CTG CTG AGG GAG GAG ACC 3694 

1084 NNIEVVTYFLLLYLLLREES 1103 

3695 GTA AAG AAG TGG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VKKWV LLLYH I LVVH PIKSV 1123 

3755 ATT GTG ATC CTA CTG ATG ATT GGG GAT GTG GTA AAG GCC GAT TCA GGG GCC CAA GAG TAC 3814 

1124 IVILLMIGDVVKADSGGQEY 1143

3815 TTG GGG AAA ATA GAC CTC TGT TTT ACA ACA GTA GTA CTA ATC GTC ATA GGT TTA ATC ATA 3874 

1144 LGK I D LCFTTVVLIVIGLI I 1163 

3875 GCC AGG CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA GCA CTG AGO GTC ACT 3934 

1164 ARRDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAG CCT GGA GTT GAC ATC OCT GTG GOG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQ PGVDI AVAVMTITLL 1203 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TGG TTA CAG TGC ATT CTC AGC 4054 

1204 MVSYVTDYFRYKKWLQCILS 1223

4055 CTG GTA TCT GCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GGT AGA ATC GAG ATG CCA 4114 

1224 LVSAVFLI RSLIYLGRIEMP 1243 

4115 GAG GTA ACT ATC CCA AAC TGG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTG ATC TCA ACA 4174 

1244 EVTI PNWR PLTLI LLYLIST 1263 

4175 ACA ATT GTA ACG AGG TGG AAG GTT GAC GTG GCT GGC CTA TTG TTG CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCVPI 1283 

4235 TTA TTG CTG GTC ACA ACC TTG TGG GCC GAC TTC TTA ACC CTA ATA CTG ATC CTG CCT ACC 4294 

1284 LLLVTTLWADFLTLI LILPT 1303 
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4295 TAT GAA TTG GTT AAA TTA TAC TAT CTG AAA ACT GTT AGG ACT GAT ATA GAA AGA ACT TGG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323

4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG AGT GGA GAG 4414 

1324 LGGI DYTRVDSIYDVDESGE 1343 

4415 GGC GTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTG CCC 4474 

1344 GVYLF PSRQKAQGNFSILLP 1363 

4475 CTT ATC AAA GCA ACA CTG ATA AGT TGC GTC AGC AGT AAA TGG CAG CTA ATA TAC ATG AGT 4534

1364 LIKATLI SCVSSKWQLIYMS 1383 

4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA CTT ATA GAA GAG ATC TCA GGA 4594 

1384 YLTLD FMYYMHRKVI EEI SG 1403 

4595 OCT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTG AAC TGG TCC ATG GAA 4654 

1404 GTNIISRLVAALIELNWSME 1423

4655 GAA GAG GAG AGC AAA GGC TTA AAG AAG TTT TAT CTA TTG TCT GGA AGG TTG AGA AAC CTA 4714 

1424 EEESKGLKKPYLLSGRLRNL 1443

4715 ATA ATA AAA CAT AAG CTA AGG AAT GAG ACC GTG GCT TCT TGG TAC GGG GAG GAG GAA GTC 4774 

1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC GGT ATG CCA AAG ATC ATG ACT ATA ATC AAG GCC AGT ACA CTG AGT AAG AGC AGG CAC 4834 

1464 YGMPK I M T X I KASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG GGC CCA GAG TGG AAA GGT GGC ACC TGC CCA AAA TCT 4894 

1484 CI I CTVCEGREWKGGTCPKC 1503 

4895 GGA CGC CAT GGG AAG CCG ATA ACG TGT GGG ATG TCG CTA GCA GAT TTT GAA GAA AGA CAC 4954

1504 GRHGK PI TCGM SLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GGC AAC TTT GAG GGT ATG TGC AGC CCA TGC CAG GGA 5014 

1524 YKR I F I R ECNPEGMCSRCQG 1543 

5015 AAG CAT AGC AGG TTT GAA ATG GAC COG GAA CCT AAG AGT GCC ACA TAC TGT GCT GAG TGT 5074 

1544 KHRR F EHDREPKSARYCAEC 1563 

5075 AAT AGG CTG CAT CCT OCT GAC GAA GCT GAC TTT TGG GCA GAG TOG AGC ATG TTG GGC CTC 5134 

1564 NRLHPAEEGDFWAESSMLCL 1583

5135 AAA ATC ACC TAC TTT GCC CTC ATG GAT GGA AAG GTG TAT GAT ATC ACA GAG TGG GCT GGA 5194 

1584 KITYFALMDGKVYDITEWAG 1603 

5195 TGC CAG CGT GTG GGA ATC TCC CCA GAT ACC CAC AGA GTC CCT TGT CAC ATC TCA TTT GGT 5254 

1604 CQRVGISPDTHRVPCHISFG 1623
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5255 TCA CTC ATC CCT TTC AGC CAG GAA TAC AAT GGC TTT GTA CAA TAT ACC GCT AGG GGG CAA 
1624 SRHPFRQEYNGFVQYTARGQ 

5315 CTA TTT CTG AGA AAC TTG CCC GTA CTG GCA ACT AAA GTA AAA ATG CTC ATG GTA GGC AAC 
1644 LFLRNLPVLATKVKMLMVGN 

5375 CTT GGA GAA GAA ATT GCT AAT CTG GAA CAT CTT GGG TGG ATC CTA AGG GGG CCT GCC GTG 
1664 L G E E I GNLEHLGWI L R G P A V 

5435 TGT AAG AAG ATC ACA GAG CAC GAA AAA TGC CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA 
1684 CKKI TEHEKCHINI LDKLTA 

5495 TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA CCC AGA GCC CCC GTG AGG TTC CCT ACG AGC 
1704 FFGI MPRGTTPRAPVRFPTS 

5555 TTA CTA AAA GTG AGG AGG GGT CTG GAG ACT GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA 
1724 LLKVR RGLETAWAYTHQGGI 

5615 ACT TCA GTC GAC CAT GTA ACC GCC GGA AAA GAT CTA CTG GTC TGT GAC AGC ATG GGA CCA 
1744 SSVDH VTAGKDLLVCDSMGR 

5675 ACT AGA GTG GTT TGC CAA AGC AAC AAC AGG TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG 
1764 TRVVCQSNNRLTDETEYGVK 

5735 ACT GAC TCA GGG TGC CCA GAC GOT GCC AGA TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC 
1784 TDSGCPDGARCYVLNPEAVN

5795 ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC CTC CAA AAC ACA GGT GGA GAA TTC ACG TGT 
1804 I SGSKGAVVH LQKTGGEFTC 

5855 GTC ACC GCA TCA GGC ACA CCO GCT TTC TTC GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC 
1824 VTASGTPAFFDLKNLK GWSG 

5915 TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT 
1844 LPIFEASSGRVVGRVKVGKN

5975 GAA GAG TCT AAA CCT ACA AAA ATA ATG ACT GGA ATC CAG ACC GTC TCA AAA AAC AGA GCA 
1864 EES K PTKI MSGIQTVSKNRA 

6035 GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT 
1884 DLTEMVKK ITSMNRGDFKQI 

6095 ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA 
1904 TLATGAGKTTE LPKAVIEEI 

6155 GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA CCA TTA AGG GCA GCC GCA GAG TCA GTC TAC 
1924 GRHKRVLVLI PLRAAAESVY 

6215 CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA 
1944 QYMRLKHPSISFNLRICDMK

6275 GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT 
1964 EGDMATG I TYASYGYFCQMP 

6335 CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT 
1984 QPKLRAAMVEYSYIFLDEYH 

6395 TCT CCC ACT CCT GAA CAA CTG GCA ATT ATC GGG AAG ATC CAC AGA TTT TCA GAG ACT ATA 
2004 CATPEQLA I IGKXHRFSESI 

6455 AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA GGG TOG GTG ACC ACA ACA GGT CAA AAG CAC 
2024 RVVAMTAT PAGSVTTTGQKH 

6515 CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA ATG AAA GGG GAG GAT CTT GGT AGT CAG TTC 
2044 PIEEFIAPEVMKGEDLGSQF

6575 CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG GAT GAG ATG AAA GGC AAT ATG TTG GTT TTT 
2064 LDIAGLKI PVDEMK GNMLVF 

6635 GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA GCA AAG AAG CTA AAA GCT AAG GGC TAT AAC 
2084 VPT RNMAVEVAKKLKAKGYN 

6695 TCT GGA TAC TAT TAC AGT GGA GAG GAT CCA GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC 
2104 SGYYYSGEDPANLRVVTSQS 

6755 CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC 
2124 PYVIVATNAI ESGVTLPDLD 

6815 ACG GTT ATA GAC ACG GGG TTC AAA TGT GAA AAC AGG GTG AGG GTA TCA TCA AAG ATA CCC 
2144 TVI DTCLKCEKRVRVSSK I P 

6875 TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC CTG ACT GTG GGT GAG CAG GCG CAG CCT AGG 
2164 FIVTGLKRMAVTVGEQAQRR

6935 GGC AGA GTA GGT AGA GTG AAA CCC GGG AGG TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG 
2184 GRVGRVKPGRYYRSQETATG
6995 TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC 7054

2204 SKDYHYDLLQAQRYGI EDGI 2223 

7055 AAC GTG ACG AAA TCC TTT AGG GAG ATC AAT TAC GAT TGC AGC CTA TAC GAG GAG GAC AGC 7114 

2224 NVTKSFREMNYDWSLY EEDS 2243 

7115 CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT AAT CTA CTC ATC TCA GAA GAC TTG CCA GCC 7174 

2244 LLITQLEILNNLLISEDLPA 2263

7175 OCT GTT AAG AAC ATA ATG GCC AGG ACT GAT CAC CCA GAC CCA ATC CAA CTT GCA TAC AAC 7234 

2264 AVKNIMARTDHPEPIQ LAYN 2283 

7235 ACC TAT GAA GTC CAG GTC CCG GTC CTG TTC CCA AAA ATA AGG AAT GCA GAA GTC ACA GAC 7294 

2284 SYEVQVPVLFPKIRNGEVTD 2303

7295 ACC TAC GAA AAT TAC TCC TTT CTA AAT GCC AGA AAG TTA GGG GAG GAT GTG CCC GTG TAT 7354

2304 TYENYSFLNAAKLCEDVPVY 2323

7355 ATC TAC OCT ACT GAA GAT GAG GAT CTC GCA GTT GAC CTC TTA GGG CTA GAC TCC CCT GAT 7414 

2324 IYATEDEDLAVDLLGLDWPD 2343

7415 CCT GGG AAC CAG CAG GTA GTG GAG ACT OCT AAA GCA CTG AAG CAA GTG ACC GGG TTC TCC 7474 

2344 PGNQQVVETGKALKQVTGLS 2363

7475 TCG GCT GAA AAT GCC CTA CTA GTG GCT TTA TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA 7534 

2364 SAENALLVALFGYVGYQALS 2383 

7535 AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA TAT ACC ATC GAG GAC CAG AGA CTA GAA GAC 7594 

2384 KRHVPMITDIYTIEDQRLED 2403

7595 ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG 7654 

2404 TTHLQYAPNAIXTDGTETEL 2423

7655 AAA GAA CTG GCC TCG GGT GAC GTC GAA AAA ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT 7714 

2424 KELASGDVEKIMGAISDYAA 2443






7715 GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA GAA AAG ATA AAA ACA GCT CCT TTC TTT AAA 7774 

2444 GGLEFVKSQAEKIKTAPLFK 2463 

7775 GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT 7834 

2464 ENAEAAKGYVQKFIDSL I EN 2483 

7835 AAA GAA GAA ATA ATC AGA TAT GGT TTG TGG GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA 7894 

2484 XEEI I RYGLWGTHTALYKSI 2503 

7895 GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT CCC ACA CTA GTG TTA AAG TGG CTA GCT TTT 7954

2504 AARLCHETAFATLVLKWLAF 2523

7955 GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT 8014

2524 GGESVSDHVKQAAVDLVVYY 2543 

8015 GTG ATG AAT AAG CCT TCC TTC CCA GGT GAC TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC 8074 

2544 VMNKPSFPGDSETQQEGRRF 2563 

8075 GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC 8134 

2564 VASLF I SALATYTYKT WNYH 2583 

8135 AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA 8194 

2584 NLSKVVEPALAYLPYATSAL 2603 

8195 AAA ATG TTC ACC CCA ACG CGG CTG GAG AGC GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA 8254

2604 KMFTPTRLESVVILSTTIYK 2623 

8255 ACA TAC CTC TCT ATA AGG AAG GGG AAG ACT GAT GGA TTG CTG GGT ACG GGG ATA AOT GCA 8314 

2624 TYLSI RKGKS DGLLGTG I SA 2643 

8315 GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA TCG GTA GGT ATA TCT GTG ATG TTG GGG CTA 8374

2644 AMEILSQNPVSVGISVMLGV 2663 

8375 GGG GCA ATC GCT CCG CAC AAC GCT ATT GAG TCC ACT GAA CAC AAA AGG ACC CTA CTT ATC 8434 

2664 GAIAAHNAIESSEQKRTLLM 2683 

8435 AAG GTG TTT GTA AAG AAC TTC TTG GAT CAG GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC 8494 

2684 KVFVKNFLDQAATDELVKEN 2703 

8495 CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA GCA GTC CAG ACA ATT GGT AAC CCC CTC AGA 8554 

2704 PEKIIMALFEAVQTIGNPLR 2723 

8555 CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG 8614 

2724 LIYHLYGVYYKGWEAKELSE 2743

8615 AGG ACA GCA GCC AGA AAC TTA TTC ACA TTC ATA ATG TTT GAA GCC TTC GAC TTA TTA GGG 8674 

2744 RTAGRNLFTLI KFEAF ELLG 2763 

8675 ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG TCC GGA AAT TAC ATT TTG GAT TTC ATA TAC 8734 

2764 MDSQGKIRNLSGNYILDLIY 2783

8735 GCC CTA CAC AAG CAA ATC AAC AGA GGG CTG AAG AAA ATG GTA CTC GGG TCG GCC CCT GCA 8794

2784 GLHKQINRCLKKMVLGWAPA 2803

8795 CCC TTT ACT TGT GAC TGG ACC CCT ACT GAC GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT 8854

2804 PFSCDWTPSDERIRLPTDNY 2823

8855 TTG AGG CTA GAA ACC AGG TGC CCA TGT GGC TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT 8914

2824 LRVETRCPCGYEMKAFKNVG 2843

8915 GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG 8974

2844 GKLTKVEESGPFLCRNRPGR 2863

8975 GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA 9034

2864 GPVNYRVTKYYDDNLREIKP 2883

9035 CTA GCA AAG TTG GAA GGA CAG GTA GAG CAC TAC TAC AAA GGG GTC ACA GCA AAA ATT GAC 9094

2884 VAKLEGQVEHYYKGVTAXID 2903

9095 TAC ACT AAA GGA AAA ATC CTC TTG GCC ACT GAC AAG TGG GAG GTG GAA CAT GGT GTC ATA 9154

2904 YSKGKMLLATDKWEVEHGVI 2923

9155 ACC AGG TTA GCT AAG AGA TAT ACT GGG GTC GGG TTC AAT GGT GCA TAC TTA GGT GAC GAG 9214

2924 TRLAKRYTCVGFNGAYLGDE 2943

9215 CCC AAT CAC OCT GCT CTA GTG GAG AGG GAC TGT GCA ACT ATA ACC AAA AAC ACA GTA CAG 9274

2944 PNHRALVERDCATITKNTVQ 2963

9275 TTT CTA AAA ATC AAG AAG GGG TGT GCC TTC ACC TAT GAC CTC ACC ATC TCC AAT CTG ACC 9334

2964 FLKMKKGCAFTYDLTISNLT 2983

9335 AGG CTC ATC GAA CTA GTA CAC AGG AAC AAT CTT GAA GAG AAG GAA ATA CCC ACC GCT ACC 9394

2984 RLIELVHRNNLEEKEIPTAT 3003

9395 GTC ACC ACA TGG CTA GCT TAC ACC TTC GTG AAT GAA GAC GTA GGG ACT ATA AAA CCA GTA 9454

3004 VTTWLAYTFVNEDVGTIKPV 3023

9455 CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA CTT GAT ATC AAT TTA CAA CCA GAG GTG CAA 9514

3024 LGERVIPDPVVDINLQPEVQ 3043

9515 GTC GAC ACC TCA GAC CTT GGC ATC ACA ATA ATT GCA AGC GAA ACC CTG ATG ACA ACG GGA 9574

3044 VDTSEVGITIIGRETLMTTG 3063

9575 GTG ACA CCT GTC TTG GAA AAA GTA GAG CCT GAC GCC AGC GAC AAC CAA AAC TCG GTG AAG 9634

3064 VTPVLEKVEPDASDNQNSVK 3083

9635 ATC GGG TTG GAT GAG GCT AAT TAC CCA GGC CCT GGA ATA CAG ACA CAT ACA CTA ACA GAA 9694

3084 IGLDEGNYPGPGIQTHTLTE 3103

9695 GAA ATA CAC AAC AGG GAT GCC AGG CCC TTC ATC ATG ATC CTG GGC TCA AGG AAT TCC ATA 9754

3104 EIHNRDARPFIMILGSRNSI 3123

9755 TCA AAT AGG GCA AAG ACT GCT AGA AAT ATA AAT CTG TAC ACA GGA AAT GAC CCC AGG GAA 9814

3124 SNRAKTARNINLYTGNDPRE 3143

9815 ATA CGA GAC TTG ATC GCT GCA GGC CCC ATG TTA GTA GTA GCA CTG AGG GAT GTC GAC CCT 9874

3144 IRDLMAAGRMLVVALRDVDP 3163

9875 GAG CTG TCT GAA ATG GTC GAT TTC AAG GGG ACT TTT TTA GAT AGG GAC GCC CTG GAG GCT 9934

3164 ELSEMVDFKGTFLDREALEA 3183

9935 CTA ACT CTC GGG CAA CCT AAA CCG AAG CAG CTT ACC AAG GAA GCT GTT ACG AAT TTG ATA 9994

3184 LSLGQPKPKQVTKEAVRNLI 3203

9995 GAA CAG AAA AAA GAT GTG GAC ATC CCT AAC TGG TTT GCA TCA CAT GAC CCA GTA TTT CTG 10054

3204 EQKKDVEIPWWFASDDPVFL 3223

10055 GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC TTA GTA GGA GAT GTT CGA GAC CTA AAA GAT 10114

3224 EVALKNDKYYLVGDVGELKD 3243

10115 CAA GCT AAA GCA CTT GGG GCC ACG GAT CAG ACA AGA ATT ATA AAG GAG GTA GGC TCA AGG 10174

3244 QAKALGATDQTRIIKEVGSR 3263

10175 ACC TAT GCC ATG AAG CTA TCT ACC TGG TTC CTC AAG GCA TCA AAC AAA CAG ATG ACT TTA 10234

3264 TYAHKLSSWFLKASNKQMSL 3283

10235 ACT CCA CTG TTT GAG GAA TTG TTG CTA CGG TCC CCA CCT GCA ACT AAG ACC AAT AAG GGC 10294

3284 TPLFEELLLRCPPATKSNKG 3303

10295 CAC ATG GCA TCA GCT TAC CAA TTG GCA CAG GGT AAC TGG GAG CCC CTC GGT TGC GGG GTG 10354

3304 HMASAYQLAQGNWEPLGCGV 3323

10355 CAC CTA GGT ACA ATA CCA GCC AGA AGC GTG AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG 10414

3324 HLGTIPARRVKIHPYEAYLK 3343

10415 TTG AAA GAT TTC ATA GAA GAA GAA GAG AAG AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA 10474

3344 LKDFIEEEEKKPRVKDTVIR 3363
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10475 GAG CAC AAC AAA TGG ATA CTT AAA AAA ATA AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA 10534 

3364 EHNKWILKKIRFQCNLNTKK 3383 

10535 ATG CTC AAC CCC GGC AAA CTA TCT GAA CAC TTG GAC AGG GAG GGG CGC AAG AGG AAC ATC 10594 

3384 KLNPGKL SEQLDREGRKRNI 3403 

10595 TAC AAC CAC CAG ATT GGT ACT ATA ATC TCA ACT GCA GGC ATA AGG CTG GAG AAA TTG CCA 10654 

3404 YNHQIGT IMSSAGIRLEK LP 3423 

10655 ATA GTC AGG CCC CAA ACC GAC ACC AAA ACC TTT CAT GAC GCA ATA AGA GAT AAG ATA GAC 10714 

3424 IVRAQTDTKTFHEAI R D K I D 3443 

10715 AAG ACT GAA AAC CGC CAA AAT CCA GAA TTG CAC AAC AAA TTG TTG GAG ATT TTC CAC ACG 10774 

3444 KSENRQNPELHNKLLEIFHT 3463

10775 ATA GCC CAA CCC ACC CTG AAA CAC ACC TAC GGT GAG CTG ACG TOG GAG CAA CTT GAG CCC 10834 

3464 I AQPTLK HTYGEVTWEQL EA 3483 

10835 GGG ATA AAT AGA AAG GGG GCA GCA GGC TTC CTG GAG AAG AAG AAC ATC GGA GAA GTA TTG 10894 

3484 CINRKGAAGFLEKKNIGEVL 3503 

10895 GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG GTC AGG GAT CTG AAG GCC GGG AGA AAG ATA 10954 

3504 DSEKHLV EQLVRDLKAGR K I 3523 

10955 AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT GAG AAG AGA GAT GTC ACT GAT GAC TGG CAG 11014 

3524 KYYETAI PKNEKRDVSDDWQ 3543 

11015 GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA 11074 

3544 AC DLVVE KRPRVIQYPEAKT 3563 

11075 ACG CTA GCC ATC ACT AAC GTC ATG TAT AAC TOG GTG AAA CAG CAC CCC GTT GTG ATT CCA 11134 

3564 RLAITKVMYNWVKQQPVV I P 3583 

11135 GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC 11194 

3584 GYEGKTPLFNIFDKVRKEWD 3603

11195 TCG TTC AAT GAG CCA GTG GCC GTA ACT TTT GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT 11254 

3604 SFNEPVAVSFDTKAWDTQVT 3623

11255 ACT AAG GAT CTG CAA CTT ATT GGA GAA ATC CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC 11314

3624 SKDLQLIGEIQKYYYKKEWH 3643

11315 AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG ACA GAA GTA CCA GTT ATA ACA GCA GAT GGT 11374

3644 KFIDTITDHMTEVPVITADG 3663

11375 GAA GTA TAT ATA AGA AAT GGG CAG AGA GGG AGC GGC CAG CCA GAC ACA ACT OCT GCC AAC 11434

3664 EVYIRNGQRGSGQPDTSAGN 3683

11435 AGC ATG TTA AAT GTC CTG ACA ATG ATG TAC GGC TTC TGC GAA AGC ACA GGG GTA CCC TAC 11494 

3684 SMLNVLTMMYGFCESTGVPY 3703

11495 AAG ACT TTC AAC AGG GTG GCA AGG ATC CAC GTC TCT GGG GAT GAT GGC TTC TTA ATA ACT 11554 

3704 KSFNRVARIHVCGDDGFLIT 3723 

11555 GAA AAA GGG TTA GGG CTG AAA TTT GCT AAC AAA GGG ATG CAG ATT CTT CAT GAA GCA GGC 11614 

3724 EKG LGLK F ANKGMQI LHEAG 3743 

11615 AAA CCT CAG AAC ATA ACG GAA GGG GAA AAG ATG AAA GTT GCC TAT AGA TTT GAG GAT ATA 11674 

3744 KPQKITECEKMKVAYRFEDI 3763 

11675 GAC TTC TGT TCT CAT ACC CCA GTC CCT GTT AGG TGG TCC GAC AAC ACC AGT ACT CAC ATG 11734 

3764 EFCSHTPVPVRWSDNTSSHM 3783 

11735 GCC GGG AGA GAC ACC CCT GTG ATA CTA TCA AAG ATG GCA ACA AGA TTG GAT TCA AGT CCA 11794 

3784 AGRDTAVILSKHATRLDSSG 3803

11795 GAG AGG GGT ACC ACA GCA TAT GAA AAA GCC GTA GCC TTC AGT TTC TTG CTG ATG TAT TCC 11854 

3804 ERGTTAYEKAVAFSFLLMYS 3823

11855 TGG AAC CCC CTT GTT AGG AGG ATT TGC CTG TTG GTC CTT TCG CAA CAC CCA GAG ACA GAC 11914 

3824 WNPUVRRICLLVLSQOPETO 3843 

11915 CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA GGT GAT CCA ATA GGG GCC TAT AAA GAT GTA 11974 

3844 PSKHATY YYKGDPIGAYKDV 3863 

11975 ATA GCT CCC AAT CTA AGT GAA CTC AAG AGA ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC 12034 

3864 IGRNLSELKRTGFEKLANLN 3883

12035 CTA AGC CTC TCC ACG TTG GGG ATC TGC ACT AAG CAC ACA AGC AAA ACA ATA ATT CAG GAC 12094 

3884 LSLSTLG IWTKHTSKRI I Q D 3903 

12095 TGT GTT GCC ATT GGG AAA GAA GAG GGC AAC TCG CTA GTT AAC GCC GAC AGG CTG ATA TCC 12154 

3904 CVAIGKEECNWLVNADRLIS 3923 

12155 AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT AAA GGC TTT ACA TTA CAA GGA AAG CAT TAT 12214 

3924 SKTCHLY I PDKGFTLQGKHY 3943 



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12215 GAG CAA CTG CAG CTA AGA ACA GAG ACA AAC CCG GTC ATG GGG GTT GGG ACT GAG AGA TAC 12274 
3944 EQLQLRTETNPVMCVCTERY 3963 

12275 AAG TTA GGT CCC ATA GTC AAT CTG CTG CTG AGA AGG TTG AAA ATT CTG CTC ATG ACC GCC 12334 
3964 KLGPIVNLLLRRLKILLMTA 3983

12335 GTC GCC GTC AGC ACC TCA gacaaaatgtacatattgtaaataaat caatccacgtacatagtgtatataaatat 12408 
3984 V C, V S S ' 3989

12409 agctgggaccgcccaccccaagaagacgacacgcccaacacgcacagccaaacagcagccaagaccatctaccccaagat 12488 

12489 aacaccacacccaacgcacacagcacctcagccgtacgaggatacgcccgacgtctatagctggaccagggaagacccct 12568

12569 aacagccccc 12578
BVDV NADL cIns- (inf. clone) -> Genes

DNA sequence 12308 b.p. gtatacgagaat ... ctaacagccccc linear 

1 gcatacgagaattagaaaaggcactcgcacacocactgggcaatcaaaaataacaaccaggcccagggaacaaacccccc 80 

81 tcagcgaaggccgaaaagaggccagccacgccctcagtaggaccagcacaacgaggggggcagcaacagtggtgagcccg 160

161 ttggacggcttaagccctgagtacagggcagtcgccagtggttcgacgccttggaataaaggtcccgagacgccacgcgg 240 

241 acgagggcacgcccaaagcacatcccaacctgagcgggggtcgcccaggcaaaagcagttccaaccgactgtcacgaaca 320

321 cagcctgacagggcgccgcagaggcccactgcattgccaccaaaaatctctgctgcacacggcac ATG GAG TTG 394 
1 MEL 3 

395 ATC ACA AAT GAA CTT TTA TAC AAA ACA TAC AAA CAA AAA CCC GTC GGG GTG GAG GAA CCT 454
4 ITNELLYKTYKQKPVGVEEP 23

455 GTT TAT GAT CAG GCA GGT GAT CCC TTA TTT GGT GAA AGG GGA GCA GTC CAC CCT CAA TCG 514 
24 VYDQAGDPLFGERGAVHPQS 43

515 ACG CTA AAG CTC CCA CAC AAG AGA GGG GAA CGC GAT GTT CCA ACC AAC TTG GCA TCC TTA 574
44 TLKLPHKRGERDVPTNLASL 63

575 CCA AAA AGA GGT GAC TGC AGG TCG GGT AAT AGC AGA GGA CCT GTG AGC GGG ATC TAC CTG 634 
64 PKRGDCRSGNSRGPVSGIYL 83

635 AAG CCA GGG CCA CTA TTT TAC CAG GAC TAT AAA GGT CCC GTC TAT CAC ACG GCC CCG CTG 694 
84 KPGPLFYQDYKGPVYHRAPL 103

695 GAG CTC TTT GAG GAG GGA TCC ATG TGT GAA ACG ACT AAA CGG ATA GGG AGA GTA ACT GGA 754 
104 ELFEEGSMCETTKRIGRVTC 123

755 ACT GAC GGA AAG CTC TAC CAC ATT TAT GTC TGT ATA GAT GGA TGT ATA ATA ATA AAA ACT 814 
124 SDGK LYHIYVCIDGCI I IKS 143 

815 GCC ACG AGA ACT TAC CAA AGG GTG TTC AGG TOG GTC CAT AAT AGG CTT GAC TGC CCT CTA 874 
144 A T R S Y Q R V F R W V H N R L D C P L 163 

875 TGG GTC ACA ACT TGC TCA GAC ACG AAA GAA GAG GGA GCA ACA AAA AAG AAA ACA CAG AAA 934 
164 WVTTCSDTKEEGATKKKTQK 183

935 CCC GAC AGA CTA GAA AGG GGG AAA ATG AAA ATA GTC CCC AAA GAA TCT GAA AAA GAC AGC 994

184 PDRLERGKMKIVPKESEKDS 203

995 AAA ACT AAA CCT CCG GAT GCT ACA ATA GTG GTG GAA GGA GTC AAA TAC CAG GTG AGG AAG 1054

204 KTKPPDATIVVEGVKYQVRK 223 

1055 AAG GGA AAA ACC AAG ACT AAA AAC ACT CAG GAC GCC TTC TAC CAT AAC AAA AAC AAA CCT 1114 

224 KGKTKSKNTQDGLYHNKNKP 243 

1115 CAG GAA TCA CGC AAG AAA CTG GAA AAA GCA TTG TTG GCG TGG GCA ATA ATA GCT ATA GTT 1174 

244 QESRKKLEKALLAWAI X A I V 263 

1175 TTG TTT CAA GTT ACA ATG GGA GAA AAC ATA ACA CAG TGG AAC CTA CAA GAT AAT GGG ACG 1234 

264 LFQVTMGENI TQWNLQDNCT 283 

1235 GAA GGG ATA CAA COG GCA ATC TTC CAA AOG GOT GTC AAT AGA ACT TTA CAT GGA ATC TGG 1294 

284 EGIQRAMFQRGVNRSLHGIW 303 

1295 CCA GAG AAA ATC TGT ACT GGT GTC CCT TCC CAT CTA GCC ACC GAT ATA GAA CTA AAA ACA 1354 

304 PEK I CTGVPSH LATDI ELKT 323 

1355 ATT CAT GGT ATC ATG GAT GCA ACT GAG AAG ACC AAC TAC ACG TCT TGC AGA CTT CAA CGC 1414 

324 IHCMMDASEKTNYTCCRLQR 343

1415 CAT GAG TGG AAC AAG CAT GGT TGG TGC AAC TGG TAC AAT ATT GAA CCC TGG ATT CTA GTC 1474 

344 HEWNKHGWCNWYNIEPWILV 363

1475 ATG AAT AGA ACC CAA GCC AAT CTC ACT GAG GGA CAA CCA CCA AGG GAG TGC GCA CTC ACT 1534 

364 MNRTQANLTEGQPPRECAVT 383 

1535 TGT AGG TAT GAT AGG GCT ACT GAC TTA AAC GTG GTA ACA CAA GCT AGA GAT AGC CCC ACA 1594 

384 CRYDRASDLNVVTQAROSPT 403 

1595 CCC TTA ACA GGT TGC AAG AAA GGA AAC AAC TTC TCC TTT GCA GCC ATA TTG ATG CGG GGC 1654 

404 PLTGCKKGKNFSFAGI LMRC 423 

1655 CCC TGC AAC TTT GAA ATA GCT GCA ACT GAT GTA TTA TTC AAA GAA CAT GAA CGC ATT ACT 1714 

424 PCNFEI AASDVLFKEHERIS 443 

1715 ATC TTC CAG GAT ACT ACT CTT TAC CTT GTT GAC GGG TTC ACC AAC TCC TTA GAA GGT GCC 1774 

444 MFQDTTLYLVDGLTNSLEGA 463
1775 AGA CAA CCA ACC GCT AAA CTC ACA ACC TGG TTA CGC AAG CAC CTC CGG ATA CTA GGA AAA 1834

464 RQCTAKLTTWLGKQ LG I LGK 483 

1835 AAG TTG GAA AAC AAG AGT AAG ACC TGG TTT GGA GCA TAC GCT OCT TCC CCT TAC TOT GAT 1894 

484 KLENKSKTWFGAYAASPYCD 503

1895 GTC GAT CGC AAA ATT GGC TAC ATA TGG TAT ACA AAA AAT TGC ACC CCT GCC TCC TTA CCC 1954 

504 VDRKIGY IWYTKNCTPACLP 523 

1955 AAG AAC ACA AAA ATT GTC GGC CCT GGG AAA TTT CAC ACC AAT GCA GAG GAC GGC AAG ATA 2014 

524 KNTKIVGPGKFDTNAEDGKI 543

2015 TTA CAT GAC ATG GGG GGT CAC TTG TCG GAG CTA CTA CTA CTT TCT TTA GTG CTC CTG TCC 2074 

544 LHEMGCH LSEVLLLS LVVLS 563 

2075 GAC TTC GCA CCG GAA ACA GCT AGT CTA ATG TAC CTA ATC CTA CAT TTT TCC ATC CCA CAA 2134 

564 DFAPETASVMYLILHFSIPQ 583 

2135 ACT CAC CTT GAT GTA ATG GAT TGT GAT AAG ACC CAC TTG AAC CTC ACA GTG GAG CTG ACA 2194 

584 SHVDVMDCDKTQL.NLT VELT 603 

2195 ACA GCT GAA GTA ATA CCA GGG TCG GTC TGG AAT CTA GGC AAA TAT GTA TGT ATA AGA CCA 2254 

604 TAEVIPGSVWNLGKYVCIRP 623

2255 AAT TGG TGG CCT TAT GAG ACA ACT GTA GTG TTG GCA TTT GAA GAG GTG AGC CAG GTG GTG 2314 

624 NWWP Y ETTVVLAF E EV SQVV 643 

2315 AAG TTA GTG TTC AGG GCA CTC AGA GAT TTA ACA CGC ATT TGG AAC GCT GCA ACA ACT ACT 2374 

644 KLVLRALRDLT RIWNAATTT 663 

2375 GCT TTT TTA GTA TGC CTT GTT AAG ATA GTC AGG GGC CAG ATG GTA CAG GGC ATT CTG TGC 2434 

664 AFLVCLVKIVRGQMVQGILW 683

2435 CTA CTA TTG ATA ACA GGG GTA CAA GGG CAC TTG GAT TGC AAA CCT GAA TTC TCC TAT GCC 2494 

684 LLLITGVQGHLDCKPEFSYA 703

2495 ATA GCA AAG GAC GAA AGA ATT GGT CAA CTG GGG GCT GAA GGC CTT ACC ACC ACT TGG AAG 2554 

704 I AKDERIGQLGAEGLTTTWK 723 






2555 GAA TAC TCA CCT GGA ATG AAG CTG GAA GAC ACA ATG GTC ATT GCT TGG TGC GAA GAT GGG 2614 

724 EYSPGMKLEDTHVIAWCEDG 743

2615 AAG TTA ATC TAC CTC CAA AGA TGC ACG AGA GAA ACC AGG TAT CTC GCA ATC TTG CAT ACA 2674 

744 KLMY LQRCTRETR Y LA I LHT 763 

2675 AGA GCC TTG CCG ACC AGT GTG GTA TTC AAA AAA CTC TTT GAT GGG GGA AAG CAA GAG GAT 2734 

764 RALPTSVVFKKLFDGRKQED 783

2735 GTA GTC GAA ATG AAC GAC AAC TTT GAA TTT GGA CTC TGC CCA TGT GAT GCC AAA CCC ATA 2794

784 VVEMNDNFEFGLC PCDAKPI 803 

2795 GTA AGA GGG AAG TTC AAT ACA ACG CTG CTG AAC GGA CCG GCC TTC CAG ATC 2854

804 VRGKFNTTLLNGPAFQMVCP 823

2855 ATA GGA TGG ACA GGG ACT GTA AGC TGT ACG TCA TTC AAT ATG GAC ACC TTA GCC ACA ACT 2914 

824 IGWTGTVSCTSFNMDT LATT 843 

2915 GTG GTA CGG ACA TAT AGA AGG TCT AAA CCA TTC CCT CAT AGG CAA GGC TGT ATC ACC CAA 2974 

844 VVRTYRR SKPFPH RQGC ITQ 863 

2975 AAG AAT CTG GGG GAG GAT CTC CAT AAC TGC ATC CTT GGA GGA AAT TGG ACT TGT GTG CCT 3034 

864 KNLGEDLHNCI LGGNWTCVP 883 

3035 GGA GAC CAA CTA CTA TAC AAA GGG GGC TCT ATT GAA TCT TGC AAG TGG TGT GGC TAT CAA 3094 

884 GDQLLYKGGSIESCKWCGYQ 903

3095 TTT AAA GAG AGT GAG GGA CTA CCA CAC TAC CCC ATT GGC AAG TGT AAA TTG GAG AAC GAG 3154 

904 FKESEGLPHYPIGKCK LENE 923 

3155 ACT GOT TAC AGG CTA GTA GAC AGT ACC TCT TCC AAT AGA GAA GGT GTG GCC ATA GTA CCA 3214 

924 TGYRLVDSTSCNR EGVA IV P 943 

3215 CAA GGG ACA TTA AAG TCC AAG ATA GGA AAA ACA ACT GTA CAG GTC ATA GCT ATG GAT ACC 3274 

944 QGTLKCK IGKTTVQV I AMDT 963 

3275 AAA CTC GGA CCT ATC CCT TGC AGA CCA TAT GAA ATC ATA TCA AGT GAG GGG CCT GTA GAA 3334 

964 KLG PMPCRPYEI I SSEGPVE 983 

3335 AAG ACA GCC TGT ACT TTC AAC TAC ACT AAG ACA TTA AAA AAT AAG TAT TTT GAG CCC AGA 3394 

984 KTACTFNYTKTLKNKY FEPR 1003 

3395 CAC AGC TAC TTT CAG CAA TAC ATC CTA AAA GGA GAG TAT CAA TAC TGG TTT GAC CTG GAG 3454 

1004 DSYFQQYMLKGEYQYWFDLE 1023

3455 GTG ACT GAC CAT CAC CGG GAT TAC TTC CCT CAG TCC ATA TTA GTC GTG GTA GTA GCC CTC 3514 

1024 VTDHHRDYFAESILVVVVAL 1043
3515 TTG OCT OGC AGA TAT GTA CTT TOG TTA CTC GTT ACA TAC ATC GTC TTA TCA GAA CAC AAG 3574

1044 LCGRYVLWLLVTYMVLSEQK 1063 

3575 GCC TTA OGC ATT CAC TAT GGA TCA GOG GAA GTC GTC ATG ATC OGC AAC TTG CTA ACC CAT 3634 

1064 ALCI QYGSGEVVMNCNLLTH 1083 



3635 AAC AAT ATT GAA GTC GTC ACA TAC TTC TTG CTG CTG TAC CTA CTC CTC AGG GAG GAG AGC 3694

1084 NNIEVVTYFLLLYLLLREES 1103



3695 GTA AAG AAG TOG GTC TTA CTC TTA TAC CAC ATC TTA GTG GTA CAC CCA ATC AAA TCT GTA 3754 

1104 VKKWVLLLYHILVVHPIKSV 1123 

3755 ATT GTG ATC CTA CTC ATC ATT GOG GAT GTC GTA AAG GCC GAT TCA OOG OGC CAA GAG TAC 3814 

1124 IVIL.LMIGDVVKAOSGGQ.EY 1143 

3815 TTG GGG AAA ATA GAC CTC TOT TTP ACA ACA CTA GTA CTA ATC GTC ATA OCT TTA ATC ATA 3874 

1144 LCKI DLCFTTVVLIVIGLI I 1163 

3875 GCC AGC CGT GAC CCA ACT ATA GTG CCA CTG GTA ACA ATA ATG GCA OCA CTG AGG GTC ACT 3934 

1164 ARRDPTIVPLVTIMAALRVT 1183 

3935 GAA CTG ACC CAC CAC OCT GGA GTT GAC ATC GCT GTC GOG GTC ATG ACT ATA ACC CTA CTG 3994 

1184 ELTHQPGVDI AVAVMTITLL 1203 

3995 ATG GTT AGC TAT GTG ACA GAT TAT TTT AGA TAT AAA AAA TOG TTA CAG TGC ATT CTC AGC 4054 

1204 MVSYVTDYFRYKKWLQCI L S 1223 

4055 CTG GTA TCT OCG GTG TTC TTG ATA AGA AGC CTA ATA TAC CTA GCT AGA ATC GAG ATC CCA 4114 

1224 LVSAVFLIRSLIYLGRIEMP 1243 

4115 GAG GTA ACT ATC CCA AAC TOG AGA CCA CTA ACT TTA ATA CTA TTA TAT TTC ATC TCA ACA 4174 

1244 EVTI PNWRPLTLI LLYLI ST 1263 

4175 ACA ATT GTA ACG AGO TOG AAG GTT GAC GTG GCT OGC CTA TTC TTC CAA TGT GTG CCT ATC 4234 

1264 TIVTRWKVDVAGLLLQCVPI 1283 

4235 TTA TTG CTC GTC ACA ACC TTG TOG GCC GAC TTC TTA ACC CTA ATA CTC ATC CTC CCT ACC 4294 

1284 LLLVTTLWADFLTLILI LPT 1303 

4295 TAT GAA TTC GTT AAA TTA TAC TAT CTC AAA ACT GTT AGG ACT GAT ATA GAA AGA ACT TOG 4354 

1304 YELVKLYYLKTVRTDIERSW 1323 



4355 CTA GGG GGG ATA GAC TAT ACA AGA GTT GAC TCC ATC TAC GAC GTT GAT GAG ACT GGA GAG 4414 
1324 LGGI DYTRVDS IY DVDESGE 1343 



4415 OGC CTA TAT CTT TTT CCA TCA AGG CAG AAA GCA CAG GGG AAT TTT TCT ATA CTC TTC CCC 4474 
1344 GVYLFPSRQKAQGNFSI LLP 1363 



4475 CTT ATC AAA GCA ACA CTG ATA ACT TGC GTC AGC ACT AAA TGC CAG CTA ATA TAC ATC ACT 4534

1364 LIKATLISCVSSKWQLIYMS 1383



4535 TAC TTA ACT TTG GAC TTT ATG TAC TAC ATG CAC AGG AAA GTT ATA GAA GAG ATC TCA GCA 4594 

1384 Y LTLDFMYYMHRKVIEE I SG 1403 

4595 GGT ACC AAC ATA ATA TCC AGG TTA GTG GCA GCA CTC ATA GAG CTC AAC TOG TCC ATG GAA 4654 

1404 GTNI ISRLVAALI ELNWSME 1423 

4655 GAA GAG GAG AGC AAA GCC TTA AAC AAG TTT TAT CTA TTG TCT GGA AGG TTC AGA AAC CTA 4714 

1424 EEESKGLKKFYLLSGRLRNL 1443 

4715 ATA ATA AAA CAT AAG GTA AGG AAT GAG ACC GTC GCT TCT TOG TAC OOG GAG GAG GAA GTC 4774 

1444 I IKHKVRNETVASWYGEEEV 1463 

4775 TAC OCT ATG CCA AAG ATC ATC ACT ATA ATC AAG GCC ACT ACA CTG ACT AAG AGC AGG CAC 4834 

1464 YGMPKIMTI I KASTLSKSRH 1483 

4835 TGC ATA ATA TGC ACT GTA TGT GAG OGC CCA GAG TOG AAA GGT OGC ACC TGC CCA AAA TGT 4894 

1484 CI I CTVCEGREWKGGTCPKC 1503 

4895 GGA CCC CAT GGG AAG OCG ATA ACC TGT GGG ATC TOO CTA GCA GAT TTT GAA GAA AGA CAC 4954 

1504 GRHGKPI TCGMSLADFEERH 1523 

4955 TAT AAA AGA ATC TTT ATA AGG GAA GCC AAC TTT GAG gggCCC TTC AGG CAG GAA TAC AAT 5014 

1524 YKR I FIREGNFE FRQEYN 1541 

5015 GGC TTT CTA CAA TAT ACC CCT ACC COG CAA CTA TTT CTC AGA AAC TTG CCC GTA CTC GCA 5074 

1542 GFVQYTARGQLFLRNLPVLA 1561 

5075 ACT AAA GTA AAA ATC CTC ATG GTA GGC AAC CTT GGA GAA GAA ATT GGT AAT CTG GAA CAT 5134 

1562 TKVKMLMVGNLGEEIGNLEH 1581 






5135 CTT GOO TOG ATC CTA AGG GGG CCT GCC GTC TCT AAG AAG ATC ACA GAG CAC GAA AAA TGC 5194 

1582 LGWI LRGPAVCKK ITEHEKC 1601 

5195 CAC ATT AAT ATA CTG GAT AAA CTA ACC GCA TTT TTC GGG ATC ATG CCA AGG GGG ACT ACA 5254 

1602 HINILDKLTAFFGIMPRGTT 1621
5255 CCC AGA GCC CCC GTG ACG TTC CCT ACG AGC TTA CTA AAA GTG AGG ACG GCT CTG GAG ACT 5314

1622 PRAPVRFPTSLLKVRRGLET 1641 

5315 GCC TGG GCT TAC ACA CAC CAA GGC GGG ATA ACT TCA GTC GAC CAT CTA ACC GCC GGA AAA 5374 

1642 AWAYTHQGGI SSVDHVTAGK 1661 

5375 GAT CTA CTG GTC TGT GAC AGC ATG GGA CGA ACT AGA GTG GTT TGC CAA AGC AAC AAC ACG 5434 

1662 DLLVCDSMGRTRVVCQSNNR 1681 

5435 TTG ACC GAT GAG ACA GAG TAT GGC GTC AAG ACT GAC TCA GGG TGC CCA GAC GGT GCC AGA 5494 

1682 LTDETEYGVKTDSGCPDGAR 1701

5495 TGT TAT GTG TTA AAT CCA GAG GCC GTT AAC ATA TCA GGA TCC AAA GGG GCA GTC GTT CAC 5554 

1702 CYVLNPEAVNISG SKGAVVH 1721 

5555 CTC CAA AAG ACA GGT GGA GAA TTC ACC TGT GTC ACC GCA TCA GGC ACA CCG GCT TTC TTC 5614 

1722 LQKTGGEFTCVTASGTPAFF 1741 

5615 GAC CTA AAA AAC TTG AAA GGA TGG TCA GGC TTG CCT ATA TTT GAA GCC TCC AGC GGG AGG 5674 

1742 DLKNLKGWSGLPI FEASSGR 1761 

5675 GTG GTT GGC AGA GTC AAA GTA GGG AAG AAT GAA GAG TCT AAA CCT ACA AAA ATA ATG ACT 5734 

1762 VVGRVKVGKNEESKPTKIMS 1781 

5735 GGA ATC CAG ACC GTC TCA AAA AAC AGA CCA GAC CTG ACC GAG ATG GTC AAG AAG ATA ACC 5794 

1782 GIQTVSKNRADLTEMVKKIT 1801 

5795 AGC ATG AAC AGG GGA GAC TTC AAG CAG ATT ACT TTG GCA ACA GGG GCA GGC AAA ACC ACA 5854

1802 SMNRGDFKQITLATGAGKTT 1821

5855 GAA CTC CCA AAA GCA GTT ATA GAG GAG ATA GGA AGA CAC AAG AGA GTA TTA GTT CTT ATA 5914 

1822 ELPRAVIEE IGRHKRVLVLI 1841 

5915 CCA TTA AGG GCA GOG GCA GAG TCA GTC TAC CAG TAT ATG AGA TTG AAA CAC CCA AGC ATC 5974 

1842 PLRAAAESVYQYMRLKHPSI 1861 

5975 TCT TTT AAC CTA AGG ATA GGG GAC ATG AAA GAG GGG GAC ATG GCA ACC GGG ATA ACC TAT 6034 

1862 SFNLRIGDMKEGDMATGITY 1881 

6035 GCA TCA TAC GGG TAC TTC TGC CAA ATG CCT CAA CCA AAG CTC AGA GCT GCT ATG GTA GAA 6094 

1882 ASYGYFCQMPQPKLRAAMVE 1901 

6095 TAC TCA TAC ATA TTC TTA GAT GAA TAC CAT TGT GCC ACT CCT GAA CAA CTG GCA ATT ATC 6154 

1902 YSYIFLDEYHCATPEQLAII 1921

6155 GOG AAG ATC CAC AGA TTT TCA GAG ACT ATA AGG GTT GTC GCC ATG ACT GCC ACG CCA GCA 6214 

1922 GK IHRFSES IRVVAMTATPA 1941 

6215 GGG TOG GTG ACC ACA ACA GGT CAA AAG CAC CCA ATA GAG GAA TTC ATA GCC CCC GAG GTA 6274

1942 GSVTTTGQKHPIEEFIAPEV 1961

6275 ATG AAA GGG GAG GAT CTT GGT ACT CAG TTC CTT GAT ATA GCA GGG TTA AAA ATA CCA GTG 6334

1962 MKGEDLGSQFLDI AG LKIPV 1981 

6335 GAT GAG ATG AAA GGC AAT ATC TTG GTT TTT GTA CCA ACG AGA AAC ATG GCA GTA GAG GTA 6394 

1982 DEMKGNMLVFVPTRNMAVEV 2001 

6395 GCA AAG AAC CTA AAA GCT AAG GGC TAT AAC TCT GGA TAC TAT TAC ACT GGA GAG GAT CCA 6454 

2002 AKKLKAKGYNSGYYYSGEDP 2021 

6455 GCC AAT CTG AGA GTT GTG ACA TCA CAA TCC CCC TAT GTA ATC GTG GCT ACA AAT GCT ATT 6514 

2022 ANLRVVTSQSPYVIVATNAI 2041 

6515 GAA TCA GGA GTG ACA CTA CCA GAT TTG GAC ACG GTT ATA CAC ACG GGG TTG AAA TGT GAA 6574 

2042 ESGVTLPDLDTV I DTGLKCE 2061 

6575 AAG AGG GTG AGG GTA TCA TCA AAG ATA CCC TTC ATC GTA ACA GGC CTT AAG AGG ATG GCC 6634 

2062 KRVRVSSKI PFI VTCLKRMA 2081 

6635 GTG ACT GTG GGT GAG CAG GCC CAG CCT AGG GGC AGA GTA GGT AGA GTG AAA CCC GOG AGG 6694 

2082 VTVGEQAQRRGR VGRVKPGR 2101 

6695 TAT TAT AGG AGC CAG GAA ACA GCA ACA GGG TCA AAG GAC TAC CAC TAT GAC CTC TTG CAG 6754 

2102 Y Y RSQETATGSK DYH YDLLQ 2121 

6755 GCA CAA AGA TAC GGG ATT GAG GAT GGA ATC AAC GTG ACG AAA TCC TTT AGG GAG ATG AAT 6814 

2122 AQRYGI EDGINVTKSFREMN 2141 

6815 TAC GAT TGG AGC CTA TAC GAG GAG GAC AGC CTA CTA ATA ACC CAG CTG GAA ATA CTA AAT 6874 

2142 Y DWS LYEEDSLLI TQLEI L N 2161 

6875 AAT CTA CTC ATC TCA GAA CAC TTC CCA GCC GCT CTT AAG AAC ATA ATG GCC AGG ACT GAT 6934 

2162 NLLI SEDLPAAVKNI MARTD 2181 

6935 CAC CCA GAG CCA ATC CAA CTT GCA TAC AAC AGC TAT GAA GTC CAG GTC CCG GTC CTG TTC 6994 

2182 HPEPIQLAYNSYEVQVPVLF 2201
6995 CCA AAA ATA AOC AAT GGA GAA GTC ACA GAC ACC TAC GAA AAT TAC TCG TTT CTA AAT CCC 7054

2202 PKI RNGEVTDTYENY S F L N A 2221 

7055 AGA AAG TTA GOG GAG GAT GTG CCC GTG TAT ATC TAC GCT ACT GAA GAT GAG GAT CTG CCA 7114 

2222 RKLGEDVPVYIYATEDEDLA 2241 

7115 GTT GAC CTC TTA GGG CTA GAC TOG CCT GAT CCT GGG AAC CAG CAG GTA GTG GAG ACT GGT 7174 

2242 VDLLGLDWPDPGNQQVVETG 2261

7175 AAA GCA CTG AAG CAA GTG ACC GGG TTG TCC TOG GCT GAA AAT GCC CTA CTA GTG GCT TTA 7234 

2262 KALKQVTCLS SAENALLVAL 2281 

7235 TTT GGG TAT GTG GGT TAC CAG GCT CTC TCA AAG AGG CAT GTC CCA ATG ATA ACA GAC ATA 7294 

2282 FGYVGYQALSKRHVPMITDI 2301 

7295 TAT ACC ATC GAG GAC CAG ACA CTA GAA GAC ACC ACC CAC CTC CAG TAT GCA CCC AAC GCC 7354 

2302 YTI EDQRLEDTTHLQYA PNA 2321 

7355 ATA AAA ACC GAT GGG ACA GAG ACT GAA CTG AAA GAA CTG GCG TCG GGT GAC GTG GAA AAA 7414 

2322 IKTDGTETELKELASGDVEK 2341

7415 ATC ATG GGA GCC ATT TCA GAT TAT GCA GCT GGG GGA CTG GAG TTT GTT AAA TCC CAA GCA 7474 

2342 IMGAISDYAAGGLEFVKSQA 2361

7475 GAA AAG ATA AAA ACA GCT CCT TTG TTT AAA GAA AAC GCA GAA GCC GCA AAA GGG TAT GTC 7534 

2362 EKIKTAPLFKENAEAAKGYV 2381

7535 CAA AAA TTC ATT GAC TCA TTA ATT GAA AAT AAA GAA GAA ATA ATC AGA TAT GGT TTG TOG 7594 

2382 QKFIDSLIENKEEI IRYGLW 2401 

7595 GGA ACA CAC ACA GCA CTA TAC AAA AGC ATA GCT GCA AGA CTG GGG CAT GAA ACA GCG TTT 7654 

2402 GTHTALYKSIAARLGKETAF 2421 

7655 GCC ACA CTA GTG TTA AAG TOG CTA GCT TTT GGA GGG GAA TCA GTG TCA GAC CAC GTC AAG 7714 

2422 AT LVLKWLAFG GE SVS DH VK 2441 

7715 CAG GCG GCA GTT GAT TTA GTG GTC TAT TAT GTG ATC AAT AAC CCT TCC TTC CCA GGT GAC 7774 

2442 QAAVDLVVYYVMNKPSFPCD 2461 

7775 TCC GAG ACA CAG CAA GAA GGG AGG CGA TTC GTC GCA AGC CTG TTC ATC TCC GCA CTG GCA 7834 

2462 SETQQEGRRFVASLFISALA 2481

7835 ACC TAC ACA TAC AAA ACT TGG AAT TAC CAC AAT CTC TCT AAA GTG GTG GAA CCA GCC CTG 7894

2482 TYTYKTWNYHNLSKVVEPAL 2501

7895 GCT TAC CTC CCC TAT GCT ACC AGC GCA TTA AAA ATC TTC ACC CCA ACG COG CTG GAG AGC 7954

2502 AYLPYATSALKMFTPTRLES 2521 

7955 GTG GTG ATA CTG AGC ACC ACG ATA TAT AAA ACA TAC CTC TCT ATA AGG AAG GGG AAG ACT 8014 



2522 VV I LSTTIYKTYLSI RKGKS 2541 

8015 GAT GGA TTG CTG GGT ACG GGG ATA ACT GCA GCC ATG GAA ATC CTG TCA CAA AAC CCA GTA 8074

2542 DGLLGTGISAAMEILSQNPV 2561

8075 TCG GTA GGT ATA TCT GTG ATG TTG GGG GTA GGG GCA ATC GCT GCG CAC AAC GCT ATT GAG 8134 

2562 SVGISVMLGVGAI AAHNAI E 2581 

8135 TCC ACT GAA CAG AAA AGG ACC CTA CTT ATG AAG GTC TTT CTA AAG AAC TTC TTG GAT CAG 8194 

2582 SSEQKRTLLMKVFVKNFLDQ 2601 

8195 GCT GCA ACA GAT GAG CTG GTA AAA GAA AAC CCA GAA AAA ATT ATA ATG GCC TTA TTT GAA 8254 

2602 AATDELVKENPEKIIMALFE 2621

8255 GCA GTC CAG ACA ATT GGT AAC CCC CTG AGA CTA ATA TAC CAC CTG TAT GGG GTT TAC TAC 8314 

2622 AVQTIGNPLRLIYHLYGVYY 2641 

8315 AAA GGT TGG GAG GCC AAG GAA CTA TCT GAG AGG ACA GCA GCC AGA AAC TTA TTC ACA TTG 8374 

2642 KGWEAKELSERTAGRNLFTL 2661 

8375 ATA ATG TTT GAA GCC TTC GAG TTA TTA GGG ATG GAC TCA CAA GGG AAA ATA AGG AAC CTG 8434 

2662 IMFEAFELLGMDSQGKIRNL 2681

8435 TCC GGA AAT TAC ATT TTG GAT TTG ATA TAC GGC CTA CAC AAG CAA ATC AAC AGA GGG CTG 8494

2682 SGNYILDLIYGLHKQINRGL 2701

8495 AAC AAA ATG GTA CTG GGG TGG GCC CCT GCA CCC TTT ACT TGT GAC TGG ACC CCT AGT GAC 8554 

2702 KKMVLGWAPAPFSCDWTPSD 2721 

8555 GAG AGG ATC AGA TTG CCA ACA GAC AAC TAT TTG AGG GTA GAA ACC AGG TCC CCA TGT GGC 8614 

2722 ER I RLPTDNYLRVETRC PCG 2741 

8615 TAT GAG ATG AAA GCT TTC AAA AAT GTA GGT GGC AAA CTT ACC AAA GTG GAG GAG AGC GGG 8674

2742 YEMKAFKNVGGKLTKVEESG 2761 

8675 CCT TTC CTA TGT AGA AAC AGA CCT GGT AGG GGA CCA GTC AAC TAC AGA GTC ACC AAG TAT 8734 

2762 PFLCRNRPGRGPVNYRVTKY 2781 
8735 TAC GAT GAC AAC CTC AGA GAG ATA AAA CCA GTA CCA AAG TTC CAA GGA CAG GTA GAC CAC 8794
2782 YDDNLREIKPVAKLEGQVEH 2801



8795 TAC TAC AAA GGG CTC ACA CCA AAA ATT GAC TAC ACT AAA GGA AAA ATG CTC TTG GCC ACT 8854
2802 YYKGVTAKIDYSKGKMLLAT 2821



8855 GAC AAG TCG GAG GTG GAA CAT GGT GTC ATA ACC ACC TTA GCT AAG AGA TAT ACT GGG GTC 8914
2822 DKWEVEHGVITRLAKRYTGV 2841



8915 GGG TTC AAT GGT GCA TAC TTA GGT CAC CAG CCC AAT CAC CCT GCT CTA GTG GAG AGG GAC 8974
2842 GFNGAYLGDEPNHRALVERD 2861



8975 TCT GCA ACT ATA ACC AAA AAC ACA GTA CAG TTT CTA AAA ATG AAG AAG GGG TCT CCC TTC 9034
2862 CATITKNTVQFLKMKKGCAF 2881



9035 ACC TAT GAC CTC ACC ATC TCC AAT CTG ACC ACC CTC ATC GAA CTA GTA CAC AGG AAC AAT 9094
2882 TYDLTISNLTRLIELVHRNN 2901



9095 CTT GAA GAG AAG GAA ATA CCC ACC GCT ACC GTC ACC ACA TCG CTA GCT TAC ACC TTC GTC 9154
2902 LEEKEIPTATVTTWLAYTFV 2921



9155 AAT GAA GAC GTA GGG ACT ATA AAA CCA CTA CTA GGA GAG AGA GTA ATC CCC GAC CCT GTA 9214
2922 NEDVGTIKPVLGERVIPDPV 2941



9215 GTT GAT ATC AAT TTA CAA CCA GAG GTG CAA CTG GAC ACC TCA GAG GTT GGG ATC ACA ATA 9274
2942 VDINLQPEVQVDTSEVGITI 2961



9275 ATT GCA ACC GAA ACC CTG ATG ACA ACC GCA GTG ACA CCT GTC TTG CAA AAA GTA CAG CCT 9334
2962 ICRETLMTTGVTPVLEKVEP 2981



9335 GAC GCC ACC GAC AAC CAA AAC TCC GTC AAG ATC GGG TTC GAT GAG GGT AAT TAC CCA CCC 9394
2982 DASDNQNSVKIGLDEGNYPG 3001



9395 CCT GGA ATA CAC ACA CAT ACA CTA ACA GAA GAA ATA CAC AAC AGG GAT GCC AGG CCC TTC 9454
3002 PGIQTHTLTEEIHNRDARPF 3021



9455 ATC ATC ATC CTC GCC TCA ACC AAT TCC ATA TCA AAT AGG GCA AAC ACT GCT ACA AAT ATA 9514
3022 INILGSRNSISNRAKTARNI 3041



9515 AAT CTG TAC ACA GCA AAT GAC CCC AGG GAA ATA CCA GAC TTC ATG CCT GCA GCC CCC ATG 9574
3042 NLYTGNDPREIRDLMAAGRM 3061

9575 TTA GTA CTA CCA CTC ACC CAT GTC GAC CCT GAG CTG TCT GAA ATC GTC GAT TTC AAC GGG 9634
3062 LVVALRDVDPELSEMVDFKG 3081



9635 ACT TTT TTA CAT AGG GAG GCC CTG GAG GCT CTA ACT CTC GGG CAA CCT AAA CCC AAG CAG 9694
3082 TFLDREALEALSLGQPKPKQ 3101

9695 GTT ACC AAG GAA GCT GTT AGG AAT TTG ATA GAA CAG AAA AAA GAT GTC GAG ATC CCT AAC 9754
3102 VTKEAVRNLIEQKKDVEIPN 3121






9755 TGG TTT GCA TCA GAT GAC CCA GTA TTT CTG GAA GTG GCC TTA AAA AAT GAT AAG TAC TAC 9814
3122 WFASDDPVFLEVALKNDKYY 3141



9815 TTA GTA GGA GAT GTT GGA GAG CTA AAA GAT CAA GCT AAA GCA CTT GGC GCC ACC GAT CAG 9874
3142 LVGDVGELKDQAKALGATDQ 3161

9875 ACA ACA ATT ATA AAC GAG GTA GGC TCA AGG ACC TAT GCC ATG AAG CTA TCT ACC TCG TTC 9934
3162 TRIIKEVGSRTYAMKLSSWF 3181

9935 CTC AAC GCA TCA AAC AAA CAG ATG ACT TTA ACT CCA CTC TTT GAG GAA TTG TTC CTA GGC 9994
3182 LKASNKQMSLTPLFEELLLR 3201



9995 TCC CCA CCT GCA ACT AAG ACC AAT AAG GGC CAC ATG GCA TCA GCT TAC CAA TTC GCA CAG 10054
3202 CPPATKSNKGHMASAYQLAQ 3221



10055 GGT AAC TGG GAG CCC CTC GGT TCC GGG GTG CAC CTA GGT ACA ATA CCA GCC AGA AGG GTG 10114
3222 GNWEPLGCGVHLGTIPARRV 3241



10115 AAG ATA CAC CCA TAT GAA GCT TAC CTG AAG TTC AAA GAT TTC ATA GAA GAA GAA GAC AAC 10174
3242 KIHPYEAYLKLKDFIEEEEK 3261

10175 AAA CCT AGG GTT AAG GAT ACA GTA ATA AGA GAC CAC AAC AAA TGG ATA CTT AAA AAA ATA 10234
3262 KPRVKDTVIREHNKWILKKI 3281

10235 AGG TTT CAA GGA AAC CTC AAC ACC AAG AAA ATG CTC AAC CCC GGG AAA CTA TCT GAA CAG 10294
3282 RFQGNLNTKKMLNPGKLSEQ 3301

10295 TTC GAC AGG CAC GGG CGC AAG AGG AAC ATC TAC AAC CAC CAG ATT GGT ACT ATA ATG TCA 10354
3302 LDREGRKRNIYNHQIGTIMS 3321

10355 ACT CCA GGC ATA AGG CTG GAG AAA TTC CCA ATA GTG AGG GCC CAA ACC GAC ACC AAA ACC 10414
3322 SAGIRLEKLPIVRAQTDTKT 3341

10415 TTT CAT GAG GCA ATA AGA CAT AAG ATA GAC AAG ACT GAA AAC CGC CAA AAT CCA GAA TTG 10474
3342 FHEAIRDKIDKSENRQNPEL 3361
10475 CAC AAC AAA TTG TTC CAG ATT TTC CAC ACC ATA GCC CAA CCC ACC CTC AAA CAC ACC TAC 10534
3362 HNKLLEIFHTIAQPTLKHTY 3381 

10535 OCT GAG GTG ACG TGG GAG CAA CTT GAG GCG GGC ATA AAT ACA AAG GOG OCA GCA GGC TTC 10594 

3382 GEVTWEQLEAGI NRKGAAGF 3401 

10595 CTG GAG AAG AAG AAC ATC GGA GAA CTA TTG GAT TCA GAA AAG CAC CTG GTA GAA CAA TTG 10654 

3402 LEKKNIGEVLDSEKHLVEQL 3421 

10655 GTC AGG GAT CTG AAG GCC GGG ACA AAG ATA AAA TAT TAT GAA ACT GCA ATA CCA AAA AAT 10714 

3422 VRDLKAGRK IKYYETAI PKN 3441 

10715 GAG AAG AGA GAT GTC ACT GAT GAC TGG CAG GCA GGG GAC CTG GTG GTT GAG AAG AGG CCA 10774 

3442 EKRDVSDDWQAGDLVVEKRP 3461 

10775 AGA GTT ATC CAA TAC CCT GAA GCC AAG ACA AGG CTA GCC ATC ACT AAG GTC ATG TAT AAC 10834 

3462 RVIQYPEAKTRLAITKVMYN 3481

10835 TGG GTG AAA CAG CAG CCC GTT GTG ATT CCA GGA TAT GAA GGA AAG ACC CCC TTG TTC AAC 10894 

3482 WVKQQPVVI PGYEGKTPLFN 3501 

10895 ATC TTT GAT AAA GTG AGA AAG GAA TGG GAC TOG TTC AAT GAG CCA GTG GCC GTA AGT TTT 10954

3502 IFDKVRKEWDSFNEPVAVSF 3521 

10955 GAC ACC AAA GCC TGG GAC ACT CAA GTG ACT AGT AAG GAT CTG CAA CTT ATT GGA GAA ATC 11014 

3522 DTKAWDTQVTSKDLQLIGEI 3541

11015 CAG AAA TAT TAC TAT AAG AAG GAG TGG CAC AAG TTC ATT GAC ACC ATC ACC GAC CAC ATG 11074

3542 QKYYYKKEWHKFIDTITDHM 3561

11075 ACA GAA GTA OCA GTT ATA ACA GCA OAT GOT GAA GTA TAT ATA AGA AAT GGG CAC AGA GGC 11134 

3562 TEV PVI TADGEVYI R N G Q R G 3581 

11135 AGC GCC CAG CCA GAC ACA AGT GOT GCC AAC ACC ATG TTA AAT GTC CTG ACA ATG ATG TAC 11194 

3582 SGQ P DTSAGNSMLNV LTMMY 3601 

11195 GCC TTC TGC GAA AGC ACA GGG GTA COG TAC AAG AGT TTC AAC AGG GTG GCA AGG ATC CAC 11254 

3602 GFCESTGVPYKSFNRVARIH 3621 

11255 GTC TGT GGC GAT GAT GGC TTC TTA ATA ACT GAA AAA GGC TTA CCC CTG AAA TTT OCT AAC 11314 

3622 V C G DDG F L I TEKG L G L K FAN 3641 

11315 AAA GCG ATG CAG ATT CTT CAT CAA GCA GGC AAA CCT CAG AAG ATA ACG GAA GGG GAA AAG 11374 

3642 KGMQILHEAGKPQKITEGEK 3661

11375 ATG AAA GTT CCC TAT AGA TTT GAG GAT ATA GAG TTC TGT TCT CAT ACC CCA CTC CCT GTT 11434 

3662 MKVAYRFED I EFCSHTPVPV 3681 

11435 AGG TGG TCC GAC AAC ACC AGT AGT CAC ATG GCC GCG AGA GAC ACC GCT GTG ATA CTA TCA 11494 

3682 R WSDNTSSHMACRDTAVI LS 3701 

11495 AAG ATG GCA ACA AGA TTG GAT TCA AGT OCA GAG AGG GGT ACC ACA GCA TAT GAA AAA GCG 11554 

3702 KMATRLDSSGERGTTAYEKA 3721 

11555 GTA GCC TTC AGT TTC TTC CTG ATG TAT TCC TOG AAC COG CTT GTT AGG AGC ATT TGC CTG 11614 

3722 VAFSFLLMYSWNPLVRRICL 3741 

11615 TTG GTC CTT TOG CAA CAG CCA GAG ACA GAC CCA TCA AAA CAT GCC ACT TAT TAT TAC AAA 11674 

3742 LVLSQQPETDPSKHATYYYK 3761 

11675 CCT GAT CCA ATA GOG CCC TAT AAA GAT GTA ATA GOT COO AAT CTA AGT GAA CTG AAG AGA 11734 

3762 GDP IGAYKDVTGRNLSELKR 3781 

11735 ACA GGC TTT GAG AAA TTG GCA AAT CTA AAC CTA AGC CTG TCC ACC TTG GGG ATC TGC ACT 11794 

3782 TGFEKLANLNLSLSTLGIWT 3801 

11795 AAG CAC ACA AGC AAA AGA ATA ATT CAG GAC TGT GTT GCC ATT GCG AAA GAA GAC GGC AAC 11854 

3802 KHTSKR I IQDCVAIGKEEGN 3821 

11855 TGG CTA GTT AAC GCC GAC AGG CTG ATA TCC AGC AAA ACT GGC CAC TTA TAC ATA CCT GAT 11914 

3822 WLVNADRLI SSKTGH LYI PD 3841 

11915 AAA CCC TTT ACA TTA CAA GGA AAG CAT TAT GAG CAA CTC CAG CTA AGA ACA GAG ACA AAC 11974 

3842 KCFTLQGKHYEQLQLRTETN 3861 

11975 COG GTC ATG GGG GTT GGG ACT GAG AGA TAC AAG TTA GGT CCC ATA GTC AAT CTC CTG CTG 12034 

3862 PVMCVCTERYKLGPIVNLLL 3881 

12035 AGA AGG TTG AAA ATT CTG CTC ATG ACC GCC GTC GCC GTC AGC AGC TGA gacaaaatgtatatat 12098 

3882 RRLKI LLMTAVGVSS • 3897 

12099 tgtaaataaattaatccatgcacacagcgtatacaaatatagccgggaccgtccacctcaagaagacgacacgcccaaca 12178 

12179 cgcacagccaaacagcagccaagactacctaccccaagataacaccacacctaacgcacacagcaccctagccgLacgag 12258 

12259 gatacgcccgacgtctacagttggactagggaagacctctaacagccccc 12308 
GTATaatcactccxctgtgaggaactactgtcttcac^
gaccccccctcccgggagagccatagtggtctgcggaaccggt^^ 
aacecgctcaatgcxtggagatttgggcgtgcccccgc^ 
ctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 13 
GTaatcactcccctgtgaggaactactgtcttcacgcagaaagcgtctagcc^
cccccctcccgggagagcxatagjggtctgcgga^ 
ccgctcaatgcctggagatttgggcgtgcccccgc^ 
atagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 14 
GTATacactccaccatgaatcactcccctgtgaggaactactgtcttcacgcaga
tgcagcctccaggaccccccctcccgggagagccatagtggtctgcggaarc^ 
tcctttcttggataaacccgctcaatgcctggagamgggcgtgcccccgcaagacigctagccga^ 
cttgtggtaagcctgatagggtgcttgcgagtgccccgggaggtctcgtagaccgtgcaccATG 



FIGURE 15 
GTATCAGAAGTGCGAATGCTGAacactccaccatgaatcactcccctgtgaggaactart
gcgtctagccatggcgttagtatgagtgtcgtg^ 

agtacaccggaattgccaggacgaccgggtcctttcttggataaacccgctcaatgcctggagatitggg 

cmgccgagtagtgttgggtcgcgaaaggccttgtggtactgcctg 

caccATG 



FIGURE 16 
GTATgccagc^ctgalgggggcgacactccaccatgaau:acuxcctglgaggaaciac

ccatggcgttagtatgagtgtegtgcagcctccaggrc 

ggaattgccaggacgaccgggtcctttcttggataaacc^ 



G 



FIGURE 17 
GTATTGCAGll'lgccagcccxctgatgggggcgacactccaccatgaatcactcccctgtgaggaac

agaaagcgtcmgccatggcgttagtatgagtgtcgtgcagcctccaggaccccccctcccgggagagccatagtggta 

cggtgagtacaccggaattgccaggacgaccgggtcctttcttgg 

gactgctagccgagtagtgttgggtcgcgaaaggccngtggtaagcctgaiagggtgcttgcgagtgccccgggaggtctc 
ccgtgcaccATG 



FIGURE 18 
CTATTGCAGTTTgccagccccctgatgggggcgacrc

agaaagcgtctagcxatggcgttagutgagtgtcgtgcagcctc^ 

cggtgagtacaccggaattgcxaggacgaccgggtcctltcttggataaacccgctc^ 

gactgctagccgagtagtgttggpcgcgaaaggccttg^^ 

ccgtgcaccATGGAGTTGATCAC\^ 

CCGTCGGGGTGGAGGAACXrrGTTTATGATCAGGCAGGTGATCCXnTATTTGGT 

GAAAGGGGAGCAGTCCACCCTCAATCXjACGCTAAAG 

GGGAACGCGATGTTCCAACCAACTTGGCATCCTTACCAAAA^ 

AGGTCGGGTAATAGCAGAGGACCTGTGAGCGGGATCTACCTGAAGCCAGGGC 

CACTATTTTACCAGGACrATAAAGGTCCCGTCTATCACA 

TCTTTGAGGAGGGATCCATGTGTGAAACGACrAAACGGATAGGGAGAGTAACr 
GGAAGTGACGGAAAGCTGTACCACATTTATGT(JIX^ 
ATAAAAAGTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCX 
GCTTGACTGCCCTCTATGGGT^ 

CAACAAAAAAGAAAACACAGAAACCCGACAGACTAGAAA^^ 
AATAGTGCCCAAAGAATCTCAAAAAGACAGCAAAACrA^ 
CAATAGTGGTGGAAGGAGTCAAATACCAGGTGAGGAAGAAGGGAAAAACCAA 
GAGTAAAAACACTCAGGACXIKXnTGTACCATAACA 

CACGCAAGAAACTGGAAAAAGCATTGTTGGCGTGGGCAATAATAGCTATAG^ 
TTGTTTCAAGTTACAATGGGAGAAAACATAAC^ 

GGGACGGAAGGGATACAACGGGCAATGTTCCAAAGGGGTGTGAATAGAAGTT
TACATGGAATCTCKjCCAGAGAAAATCTGTACTGGTGTCCCnTCCCATCT
CXX3ATATAGAACTAAAAACAATTCATGGTATGATGGATGCAAGTGAGAAGACC
AACTACACGTGTTGCAGACTTCAACGCCATGAGTGGAACAAGCATGGTTGGTG
CAACTGGTACAATATTGAACCCTGGATTCTAGTCATC
TCTCACTGAGGGACAACCACCAAGGGAGTGCGCAGTCACTTGTAGGTATGATA
GGGCTAGTGACTTAAACGTGGTAACACAAGCTAGAGATAGCCCCACACCOTA
ACAGGTTGCAAGAAAGGAAAGAACTTCTCCTTTGCAG 
rcCCTGC AACITTGAAATAGCTTC 

CATTAGTATGTTCCAGGATACTACTC1TTACCTTGTTGACGGGTTO 

TTAGAAGGTGCCAGACAAGGAACCGCTAAACrGACAACCTGGTTAGGCAAGCA 

GCTCGGGATACTAGGAAAAAAGTTGGAAAACAAGAGTAAGACGTGGTTTGGAG 

CATACGCTGCnTCCCXTrACTGTGATGTCGATCGCAAAATTGGCT 

ATAQ\AAAAATTGCACCCCTGCCITXnTAOCCAAG 

CIXjGGAAATTTGACACCAATGCAGAGGACXjGCAAGATATTACATGAGATGGGG 

GGTCACTTGTCXKJAGGTACTACTACTTrCTTTAGTGGTGCT 

CCXXJAAACAGCTAGTCTAATGTACCTAATCXZTACATIT^ 

ACGTTGATGTAATGGATTGTGATAAGACCCAGTTGAACXTCACAGTGGAGCTG 

ACAACAGCTGAAGTAATACCAGGGTCGGTCTGGAATCTAGGCAAATATGTATG 

TATAAGACCAAATTGGTGGCCTTATGAGACAACTGTAGTGTTGGCATTTGAAGA 

GGTGAGCCAGGTGGTGAAGTTAGTGTTGAGGGCACTCAGAGATTTAACACGCA 

TTTGG AACGCTGC AACAACT A CTGC ' ACT ATGCCTTGTT AAG AT AGTC AG 

GGGCCAGATGGTACAGGGCATTCTGTGGCTACTATTGATAACAGGGGTACAAG 

GGCACTTGGATTGCAAACCTGAATTCTCGTATGCCATAGCAAAGGACGAAAGA 

ATTGGTCAACTGGGGGCTGAAGGCCrTACCACCACTTGGAAGGAATACTCACC 

TGGAATGAAGCIXK3AAGACACAATGGTCATTGCTTGGTGCGAAGATGGGAAGT 

TAATCTACCTCCAAAGATGCACGAGAGAAACCAGCTATCTCGCAATCTTGCATA 

CAAGAGCCTTGCCGACCAGTCTGGTATTCAAAAAACTCTTTGATGGGCGAAAG 
CAAGAGGATGTAGTCGAAATGAACGACAACITTGAATTT

GATGCCAAACCCATAGTAAGAGGGAAGTTCAATACAACGCTGCTGAACGGACC 

GGCCTTCCAGATGGTATGCCCCATAGGATGGACAGGGACTGTAAGCTGTACGT 

CATTCAATATGGACACCTTAGCCACAACTGTGGTACGGACATATAGAAGGTCTA 

AACCATTCCCrCATAGGCAAGGCTGTATCACCCAAAAGAATCT^ 

CTCCATAACTGCATCCTTGGAGGAAATTGGACTTGTGTGCCT 

CTATACAAAGGGGGCTCTATTGAATCTTGCAAGTGGTGTGGCr 

GAGAGTGAGGGACTACXIACACTACCCCATTGGCAAGTGTAAATTGGAGAACGA 

GACTGGTTACAGGCTAGTAGACAGTACCTCTTGCAATAGAGAAGGTGTGGCCA 

TAGTACCACAAGGGACATTAAAGTGCAAGATAGGAAAAACAACTGTACAGGTC 

ATAGCTATGGATACCAAACTCGGACCTATGCCTTGCAGACX 

TCAAGTGAGGGGCCn XjTAG AAAAGACAGCGTGTACTTTCAACTACACTAA 

ATTAAAAAATAAGTATTTTGAGCCXZAGAGACAGCT 

AAAAGGAGAGTATCAATACTCGTTTGACC^^ 

ATTACTTQjCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTG 

ATGTACTTTGGTTACTGGTTACATACATGGTCTTATCAGAACAGAAGGC 

GGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCrAACCCAT 

AACAATATTGAAGTGGTGACATACTTCTTGCTGCTGTACCTACT 

GAGAGCGTAAAGAAGTGGGTCTTACTCTTATACCACAT^ 

ATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTAAAGGCCG 

TCAGGGGGCCAAG AGTAC TTGGGGAAAATAGACCTCTGTTTTACAACAGTAGT 

ACTAATCGTCATAGGTITAATCATAGCCA 

GGTAACAATAATGGCAGCACTGAGGGTCAC^ 

TTGACATCGCTGTGGCGGTCATGACTATAACCXJrACT^ 

CAGATTATTTTAGATATAAAAAATGGTTACAGTGCATTCT^ 

GGTGTTCTTGATAAGAAGCCTAATATA CCTA GGTAGAATCGAGATGC^GAGG 

TAACTATCXZCAAACTXKjAGACCACTAACT^ 

AACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCXJ^ 

TGCCTATCITATTGCTGGTCACAACXnTGTGGGCCXj 

GATCXZTGCCTACCTATGAATTGGTTAAATTATACTATCTX3 

GATATAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAAGAGTTGACTCCAT 
CTTACGACGTTGAT GAGAG TGGAGAGGGCCrrATATCTTTTTCCATCAAGGCAGA 
AAGCACAGGGGAATTrrrCTATACTCTTGCC(XTTATC 
GTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTTACTT^ 

ttatgtactacatgcacagkjaaagttatagaagagatctciaggaggtaccaa^ 
taatat(xl\ggttagtggcagcacix:atagagctoaact^ 

GAGGAGAGCAAAGGCTTAAAGAAGTTITATCTATTGTCTXKjAAGGT^ 

cchvvataataaaacataaggtaaggaatgagacot^ 
aggaggaagtctacggtatgccaaagatcatgactataatca^ 

CIXjAGTAAGAGCAGGCACTGCATAATATGCACTGTATCTGAGGGCCGAGAGTG 

gaaaggtggcacctgc ccaaa atgtggacgccatgggaagccgataacg 

gggatgtcxkttagcagattttg 

gaaggcaactttgagggtatgtgcagcc^^ 

TTGAAATGGACCXKKjAACCTAAG 

ctgcatcctgctgaggaaggtgacttttgggcagagtcgagcatgttgggcct 
caaaatcacctactttgcgctgatggatggaaagc^^ 

GGCTCKjATGCCAGCGTGTCKjGAATCTCCCCAGATACCCACAGAGTCCCTTGTC 

ACATCrcATTTGGTTCAGjGATGCCTTTCAGGCAGGA^ 

AATATACCGCTAGGGGGCAACTATTTCKjAGAAACITGCCCXj^ 

AAGT AAAAATGCTC ATGG f AGGCAACCTTGG AG AAG AAATTGGT AATCTGG AA 

CATC1TGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACAGAGCA 

CGAAAAATGCCACATTAATATACTGGATA/^ 

GCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTACXjAGCTTACTAA 
AAGTGAGGAGGGGTCTGGAGACTGCCTGGGCTTACACACACCAAGGCGGGAT 
AAGTTCAGTCGACCATGTAACCGCCGGAAAAGATCTACTGGTCTGTGACAGCA

TGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGAOCGATGAGACA 

GAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGCCAGATGTTATGT 

GTTAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGGCAGTCGTTCACC 

TCCAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACACCGGCr 

TTCTTCXjACCTAAAAAACTTGAAAGGATGGTCAGGCTTGCCTATATTTGAAGCC 

tccagcgggagggtggttggcagagtcaaagtagggaagaatgaagagtcta 

aacctacaaaaataatgagtggaatccagaccgtctcaaaaaacagagcagac 

ctgac cx3ag atggtcaagaagataaccagcatgaacaggggagacttcaagca 

gattactttggcaacaggggcaggcaaaacx:acagaactcccaaaagcagtta 

tagaggagataggaagacacaagagagtattagttcttatacxzattaagggca 

GCGGCAGAGTCAGTCTACXIAGTATATGAGATTGAAACACCCAAGCATCTCTTTT 

AACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCXjGGATAACCT 

ATGCATCATACGGGTACITCTGCCAAATGCXnxrAACCAAAGCTCAGAGCIXjCT 

TGGTAGAATACTCATACATATTCnTAGATGAATACXZATTGTGCCACTCCTGAACA 

ACTGGCAATTATCGGGAAGATCCACAGATTTTCAGAGAGTATAAGGGTTGTCXj 

CCATGACTGCCACGCCAGCAGGGTCXjGTGACXrACAACAGGTCAAAAGCACCCA 

ATAGAGGAATTCATAGCCXX:CGAGGTAATGAAAGGGGAGGATCTTGGTAGTCA 

GTTC CTTGATAT AGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGGCAATAT 

GTTGGI 1 1 riGTACCAACGAGAAACATGGCAGTAGAGGTAGCAAAGAAGCTAA 

AAGCTAAGGGCTATAACTClXXxATACTATTACAGTGGAGAGGATCCAGCCAAT 

CTOAGAGTTGTGACATCACAATCCCCXTATGTAATCGTG 

GAATCAGGAGTGACACTACCAGATTTGGACACXX7rTATAGACAOGGGGTTGAA 
ATGTGAAAAGAGGGTGAGGGTATCATCAAAGATAC(XTTCATCGTAACAGGCC 
TTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGGCAGAGT 
AGGTAGAGTGAAACXXXKXjAGGTATTATAGGAGCCAGGAAACAGCAACAGGG 

tcaaaggactaccactatgacctcttgcaggcacaaagatacgggattgagga
tggaatcaacgtgacgaaatcctttagggagatgaattacgattggagcctata

CGAGGAGGAC^GCCTACTAATAACCCAGCIXXjAAATACTAAATAATCTACTCAT

CTCAGAAGACTTGXX^GCXX3CTGTTAAGAACATAATG!GCC^GGACTGATCACC

CAGAGCCAATCCAACTTXjCATACAACAGCTATGAAGTCCAGGTCCCGGTCCTGT

TCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACGAAAATTACTCXTITrc

TAAATGCCAGAAAGTTAGGGGAGGATGTGCXXXTIXjrATATCTACGCTACTGAA
GATGAGGATCTGGCAGTTGACCTCrrAGGGCTAGACTGGCXTGATCCTGGGAA 
CCAGCAGGTAGTGGAGACTGKjTAAAGC^CTGAAGCAAGTGACCGGGTTGTCCT 

cggctgaaaatgccctactagtggctttatttgggtatgtgggttaccaggctc 

tctcaaagaggcatgtccxiaatgataacagacatatatacxiatcgaggagcaga 

gactagaagacaccacc^cctc<^gtatgcacccaacgccataaaaaccgat 

gggacagagactgaactgaaagaactggcgtcgggtgacgtggaaaaaatca 

tgggagccatttcagattatgcagctgggggactggagtttgttaaatccc^ 

gcag aaaag at aaaaacagctcc 1 11 u i ' 1 ' i aaagaaaacgcag aagccgcaaa 

agggtatg tcca aaaattcattgactcattaattgaaaataaa 

cagatatggtttgtggggaacacacacagcactatacaaaagcatagctgcaa 

gactggggcatgaaacagcgtttgccacactagtgttaaagtggctagcttt^ 

ggaggggaatcagtgtcagaccacgtcaagcaggcggcagttgatttagtgg 

tctattatgtgatgaataagccttccttcccaggtgactccgagacacagcaag 

AAGGGAGGOjATTCGTCGCAAGCCTGTTCATCTCCGCACTGGCAACCTACACA 

TACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGAACCAGCXXTGGCT 

TACXrrCCCCTATGCTACCAGCGCATTAAAAATGTTCACCCCAACGCGGCTGGAG 

AGCGTGGTGATACTGAGCACXTACGATATATAAAACATACCTCTCTATAAGGAAG 

GGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGCAGCCATGGAAATCC 

TGTCACAAAACCCAGTATCGCTAGGTATATCTGTGATGTTGGGGGTAGGGGCA 

ATCGCTGCGCACAACGCTATTGAGTCCAGTCAACAGAAAAGGACCCTACTTAT 

GAAGGTGTTTGTAAAGAACTTCTTGGATCAGGCTGCAACAGATGAGCTGGTAA 
AAGAAAACCCAGAAAAAA^

GTAACCCCCIX}AGACTAATATACCACCTGTATGGGGTTTACT 

AGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTG^ 

ATGTTTGAAGCCTTCX5AGTTATTAGGGATGGACTCACAAGGGAAAATAAGG^ 

CCTGTCCGGAAATTACATTTn3GATTTGATATACGGCCTA 

CAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGT^ 

ACrTCGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACT 

GTAGAAACCAGGTGCCCATGTGGCTATGAGATGAAAGKJITTCAAAAATGTAGG 

TGGCAAACnTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATCT 

CTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATTACGATGACAACCTC 

AGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACTACTACAA 

AGGGGTCAGAGCAAAAATTGACTACAGTAAAGGAAAAATGCT 

AGAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCT 

GGGGTCGGGTTCAATGGTGCATACITAGGTGACGAGCCCAATCACCGTGCTCr 

AGTGGAGAGGGACTGTGCAACTATAACXrAAAAACACAGTACAGTTTCTAAAAAT 

GAAGAAGGGGTGTCCXTITCACXJrATGACCro 

TC^TCGAACTAGTACACAGGAACAATCrTGAAGAGAAGGAAATACXXZAGCGCr 

ACGGTCACXL\CATGGCTAGCnTACACCTTTOTGAATGAAGACGTA 

AAAACCAGTACTAGGAGAGAGAGTAATCCCCGACXXn'GTAGTTGATATCAATTT 

ACAACCAGAGGTGCAAGTGGACACGTCAGACKnTGGGATC 

GGGAAACCCTGATGACAACXK3GAGTGACACCTGTCTTGGAAAAAGTAGAGCCT 

GACXK!CAGCGACAACCAAAACTCGGTGAAGATCGGGTTGGATGAGGGTAATTA 

CXXZAGGGCCTtKjAATACAGACACATACACTAACAGAAGAAATACACAACAGGG 

ATGCGAGGCCCTTCATCATGATCXnXXK^ 

CAAAGACIXXJTAGAAATATAAATCTGTACACAGGAAATGACXXXIAGGGAAATA 

CGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCACTGAGGGATGTCGA 

CCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACT 

CCCTGGAGGCTCTAAGTCTCGGGCAACXTAAACCGAAGCAG 

GCTGTTAGGAATTroATAGAACAGAAAAAAGATGTGGAGATCCXT 

GCATCAGATGACCCAGTATTTCTGGAAGTGGCCTTAAAAAATGAT^ 

TTAGTAGGAGATGTTGGAGAGCTAAAAGATCAAGCrAAAGCACIT^ 

GGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCATGAAGC 

TATCTAGCTGGITCCTCAAGGCATCAAACAAACAGATGAGTTTAACT^ 

TTGAGGAATTGTTGCTACGGTGCCCACXnXKZAACT 

ATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCCCCTCGCnTGCGG 
GGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGATACACCCATATGAAG 
CTTACXZTGAAGTTGAAAGATTTCATAGA^ 

AAGGATACAGTAATAAGAGAGCAC^AC^AATGGATACTTAAAAAAATAAGGTTT 
CAAGGAAACCTCAACACCAAGAAAATGCTCAACXX 

GTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATTGGTACT 
ATAATGTCAAGTGCA GGCA TAAGGCTGGAGAAATTGCCAATAGTGAGGGCCCA 
AACCGACACCAAAACCTTTCATGACKjCAATAAGAGATAAGATAGACAAGAGTG 
AAAACCGGCAAAATCCAGAATTGCACAACAAAT^ 

TAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAGCAACTT 

GAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTCCTGGAGAAGAAGAACA 

TCGGAGAAGTATTGGATTCAGAAAAGCAOCTGGTAGAACAATTGGTCAGGGAT 

CTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTCCAATACCAAAAAATGA 

GAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACCTGGTGGTTGAGAAG 

AGGC£AAGAGTTATCCAATA(XCTGAAGCCAAGACAAGG 

GGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTGATTCCAGGATATGAAG 

G AAAG ACCCCU 11 U 1 4 C AAC ATCTTTG ATAAAGTG AG AAAGG AATGGG ACTCGT 

TGAATGAGCCAGTGGCCGTAAGTTTTGACACCAAAGCCTGGGACACTCAAGTG 

ACTAGTAAGGATCTGCAACTTATrGGAGAAATCCAGAAATATTACrATAAGAAG 

GAGTGGCACAAG1TCATTGACACCATCACCGACCACATGACAGAAGTACCAGT 
TATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGGAGCGGC

CAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCTGACAATGATGTA 

CGGCTTCTGCGAAAGCACAGGGGTACCGTACAAGAGTTTCAACAGGGTGGCAA 

GGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAAAAAGGGTTAGGG 

CTGAAATTTGCTAACAAAGGGATGC^GATTCTrCATGAAGCAGGC 

AAGATAACX3GAAGGGGAAAAGATGAAAGTTGCCTATAGATTTGAGGATATAGA 

GTTCTGTTCTCATAC(XCAGTCXXnXnTAGGTGGTCCGACAACACCAGTAGTCA 

CATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACAAGATTGG 

ATTCAAGTGGAGAGAGGGGTACCACAGCATATGAAAAAGCGGTAGCCTTCAGT 

TTCTrGCTGATGTATTCCTGGAACCCGCTTGTTAGGAGGATTTGCCTGTTGGTC 

CTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTATTACAAA 

GGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAAGTGAACT 

GAAGAGAACAGGCTITGAGAAATTX3GCAAATCTAAACCTAAGCCTGTCCAOGTT 

GGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTGTGTTGCCA 

TTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGATATCCAGC

AAAACIXKK^C^CITATACATACCrGATAAAGGCTTrACATTACAAGGAAAGCAT

TATGAGCAACTGCAGCTAAGAAC«i.GAGACAAACCCGGTCATGGGGGTTGGGA

CTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAGAAGGTTGAAA

ATTCTCKntATGACGGCCGTCGGCGTCAGCAGCTGAag^

gccatttcctgtttttttttttmmantttnt^^

cttccttctttaatggtggctccatctttgccctagtcacggcugctg^

ggcctctctgcagatcatgtCCCCCGGCCGTCGGCGTCAGCTGAgacaaaatgtatatattg ^ 

catgjacatagtgtatalaaatatagttgggaccgtccacc^ 

atctaaacaagataacactacatttaatgcacacagcactttagctgtatgag 

ctaacagcoccc 
Gtatacgagaattagaaaaggcactcgtatacgtattgggcaattaaaaataataattaggcctaggtacatggcacgtgccagccccct

g^gggggcgacactccaccatgaatcactc^ 

tgtcgtgcagcctccaggaccccccctcccgggagagra 

cgggtccmcttggataaacccgctcaatgcctggagamgggcgtgcccccgca^ 

aggccrcgtggtactgcctgatagggtgcttgcgagtg^^ 

CTAAACXTCAAAGAAAAACCAAACGTAACACCAACCGTCGCCCACAGGAC^ 

AAGTTCCCGGGTGGCGGTCAGATCXnTGGTGGAGTTTACTTGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACXITCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCT 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACXrCTTACGT 

GCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGT CCGG GTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTrCXnX3GTTGCICT 

G CTCT CTTGCCTGACCGTGCCXXKiriTCAGCCTACCA 

GCITTACCATGTCACCAATGATTGC<XTAACTCGAGTATTC 

CGATGCCATCCTGCACACTCXXjGGGTGTGTCCC^^ 

CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 

ACTCCCCACAACXjCAGCTTCGAajI^TATCGATCT 

CCCTCTGCTCG^ 

GTCAACTGTTTACCTTCTCTCCCAGGCGCCAC^ 

GTTCTATCTATCCCGGCCATATAACGGGTCATCGCATGGCATGGGATATGATGA 

TGAACTGGTCCCCTACGGCAGCGTTGGTGGTAGCTCAGCTGCTCCGGATCCCA 

CAAGCGATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT 

AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC

TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG

CACCACGGCTGGGCITGTTGKjTCTCCTTACACCAGGCGCCAAGCAGAACATCC

AACTGATCAAC^CCAACGGCAGTTGGCACATCAATAGCACGGCCTTGAA

AATGAAAGOTITAACACCX3GCTGGTTAGCAGGGCT

AACTCTTCAGGCTGTCCTGAGAGGTTGGCCAGCTGCCGAC

GCCCAGGGCIXKK3GTCCTATCACTITATGCCAA

GCC(XTACTGCTGGCACTACCCT(XAAGACCTTGTGGCATTGTGCCCGCAAAG 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGG^ 

GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 

GTCTTCGTCCTrAACAACACCAGGCCACCGCTGGGCAATTGGTTCGGT^ 

TGGATGAACTCAACIXKjATTCACCAAAGTGTGCGGAGCG^ 

CGGAGGGGTGGGCAACAACACCnTGCTCTGCCCCACTGAT^ 

ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACA 

TGCATGGTCGACTACCCGTATAGGCrnXKJCACT 

ACCATATTCAAAGTCAGGATGTACGTGGGAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACX}CGGGGCGAACGCTGTGATCrGGAAGACAGGGACAG 

GTC CGAG CTCAGCCCATTGCTGCIXjIX^CACCACACAGTGGCAGGTCCTTCCGT 

GTTCITTCACGACCCTGCCAGCCTTGTCCACCGGCCTCATCCACCTCCAC 

ACATTGTGGACGTGCAGTACTTGTACG^ 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTCCTCCTGCTTGCAGACGCGCGC 
GTCTGCTCCTGCTTGTGGATGATGTTACTCATATCCCAAGCGGAGGCGGCTTTG 
GAGAACCTCGTAATACTCAATGCAGCATCCCTGGCCGGGACGCACGGTCTTGT 
GTCCTTCCTCGTGTTCTTCTGCITTGCGTGGTATCTGAAGGGTAGGTGGGTGCC 
CGGAGCGGTCTACGCCTTCTACGGGAAGTGGGTCTTACTCTTATA
AGTGGTACACTCAATCAAATCTCTAATTGTGATCCTA 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT 
TTACAACAGTAGTACTAATCGTCAT^^ 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 
CACCAGCCTGGAGTrGACATCGCTGTGGCGGTCATGACTATAACCCTACTGAT 
GGTTAGCTATGTGACAGATTATTTTAGATATAAAAA 
AGCCTGGTATCTGCGGTGTTCnTGATAAGAAGCCT 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGACCACTAACTTTAATACT 
TATTTGATCTCAACAACAATTGTAACGAGGTGGAAGGTTGACGTGGCTGGCCT^ 
TTGTTGCAATGTGTGCCnWTCnTATTGCTGGTCAC 
TAACCCTAATACTGATCCTGCCTACCTATGAATTGGT^ 

AACTXnTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAA 

GAGTTGACTCGATCTACGACGTTGA TGAGA GTGGAGAGGGCGTATATCTTTTTC 

CATCAAGGCAGAAAGCACAGGGGAATTTTI^ 

CAACACTGATAAGTTGCGTCAGCAGTAAATGGCAGCT 

TAACmGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAGAG 

GAGGTACXIAACATAATATCCAGGTTAGTGGCAGCACrCATA 

TCCATGGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTITATCT^ 

AAGGTTGAGAAACCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCTT 

CTTGGTACGGGGAGGAGGAAGTCTACXKjTATGCXrAAAGATCATGACTATAATC 

AAGGCCAGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGA 

GGGCCGAGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAG 

CCG ATAA CGTGTGGGATGTGGCTAGCAGATTTTGAAGAAAGACACTATAAAAG 

AATCITTATAAGGGAAGGCAACTT^ 

AGCATAGGAGGTITGAAATGGACXXKKjAACCTAAGAGTGCCAGATACTGTGCT 
G AGTGT AAT AGGCTGC ATCCTGCTG AGG AAGGTG AC 1T1 ' 1 tjGGCAG AGTCG AG 
CATGTTGGGCCTCAAAATCACCTACTITG 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCTCCCCAGATACCCACA 

GAGT CCCT TGTCACATCTCATTTGGTT^ 

ATGGCITIXn'ACAATATACCGCTAGGGGGCAACrATTT 

TACTGGC^CTAAACrrAAAAATGCn^ 

GGTAATCnXJAACATCITGGCTGGATCCrAAGGGGGCCT 

GATCAC^GAGCACGAAAAATGCCACATTAATATACTGGATAAACT 

TTTCGGGATCATGCCAAGGGGGACTACACCCAGAGCCCCGGTGAGGTTCCCTA 

CGAGCnTACTAAAAGTGAGGAGGGGTCTGGAGACTGCCTGGGCrrACACACA 

CAAGGCGGGATAAGTTCAGTCGACXATGTAACCGCCGGAAAAGATCTACTOGT 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAACAGGTTGA 

CCGATGAGACAGAGTATCKjCGTCAAGACTGA^^ 

CAGATGTTATGTGTITAAATCCAGAGGCCGTTAACATATCAGGATCCAAAGGGG 

CAGTCGTTCACCTCCAAAAGACAGGTGGAGAATTCACGTG^ 

GGCACACCGGC144Lll'CGACCTAAAAAACTrGAAAGGATGGTCAGGCITG<XT 

ATATTTGAAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACXn*ACAAAAATAATGAGTGGAATCXrAGACCGTCTCAAAAA 

ACAGAGCAGACCTGACC XjAGA TGGTCAAGAAGATAACCAGCATGAACAGGGG 

AGACTTCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACAGAACT 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CCATTAAGGGCAGCGGCAGACTCAGTCTACCAGTATATGAGATTGAAACACCC 

AAGCATCTCTTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAA 

CCGGGATAACCTATGCATCATACGGGTACrTCTGCCAAATGCCTCAACCAAA 

TCAGAGCTGCTATGGTAGAATACrcATACATATTCTTAGATGAATACCATTGTGC 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCAC^GATTTTC 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 

CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG 
ATCTTGGTAGTCAGTTC CTTCATA TAGCAGGGTTAAAAATACCAGTGGATGAGA

TGAAAGGCAATATGTTGGTTnTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCA 

TCGTAACAGGCTTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

ACAGCAACAGGGTCAAAGGACTACCACTATGACXTrCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTTTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACTCATCTCAGAAGACTTGCCAGCCX3CTGTTAAGAACATAATGGCC 

AGGACTGATCACCXIAGAGCCAATCXZAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTATTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATA 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGG 

CCTGATCCTGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCXrrCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGT 

GGGTTACC^GGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATAC 

CATCGAGGACCAGAGACTAGAAGACACCACCC^CCTCCAGTATGCACCCAACG 

CGATAAAAACXTGATGGGACAGAGACTGAACTCAAAGAACTGGCGTCXKjGTGA 

CGTGGAAAAAATCATGGGAGCCATTTCAGATTATGCAGCTGGGGGACT^ 

TTGTTAAATCXXIAAGCAGAAAAGATAAAAACAGCTCC 1 1 lO'l'l I AAAGAAAACG 

CAGAAGCCGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGAAAATA 

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGGACTATACAAA 

AGCATAGCrGCAAGACTXjGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAA 

GTGGCTAGCI 1 1 1 GG AGGGGAATC AGTGTCAG ACC ACGTC AAGCAGGCGGC A 

GTTGATTTAGTGGTCTATTATGTGATGAATAAGCCTTCCTTCCCAGGTGACTCC

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACT

CK3CAACCTACACATACAAAACTTGGAATTACCACAATCTCTCTAAAGTGGTGGA

ACCAGCCCTGGCTTACXnGCCCTATGCTACC^GCGCATTAAAAATGTTCACCCC

AACGCGGCTGGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACC

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC

AGCCATGGAAATCCTGTCACAAAACCCAGTATCGGTAGGTATATCTGTGATGTT

GGGGGTAGGGGCAATCGCTGCXjCACAAOGCTATTGAGTCCAGTGAACAGAAA 

aggaccctacttatgaaggtgtttgtaaagaacttcttggatcaggctgcaaca 

gatgagctggtaaaagaaaacccagaaaaaattataatggccttatttgaagca 

gtccagacaattggtaaccccctgagactaatataccacctgtatggggtttac 

tacaaaggttgggaggccaaggaactatctgagaggacagcaggcagaaact 

tattcacattgataatgtttgaagccttcgagttattagggatggactcacaag 

ggaaaataaggaacctgtccggaaattac^ttttggattrgatatacggcctac 

aca agca aatcaacagagggctgaagaaaatggtactggggtgggcccctgc 

accctttagttgtgactggacccctagtgacgagaggatcagattgccaacag 

acaactatttgagggtagaaaccaggtgcccatgtggctatgagatgaaagct 

ttcaaaaatgtaggtgggaaacntaccaaagtggaggagagcgggcctttcct 

atgtagaaacagacctggtaggggaccagtcaactacagagtcaccaagtatt 

acgatgacaacctcagagagataaaaccagtagcaaagttggaaggacaggta 

gagcactactacaaaggggtcacagcaaaaattgactacagtaaaggaaaaat 

gctcttggccactgacaagtgggaggtggaacatggtgtcataaccaggttag 

craagagatatactggggtcgggttcaatggtgcatacttaggtgacgagccc 

aatcaccgtgctctagtggagagggactgtgcaactataaccaaaaacacagt 

acagtttctaaaaatgaagaaggggtgtgcgttcacctatgacctgaccatctc 

caatctgaccaggctcatcgaactagtacacaggaacaatcttgaagagaagg 

aaatacccaccgctacggtcaccacatggctagcttacaccttcgtgaatgaag 
acgtagggactataaaaccagtactaggagagagagtaatcxzccgaccctgta
gttgatatcaatttacaaccagaggtgcaagtggacacgtcagaggttgggat 

CACAATAATTGGAAGGGAAACCCTGATGACAACGGGAGTGACACCTCjTCTTGG 
AAAAAGTAGAGCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTG 

gatgagggtaattacccagggcctggaatacagacacatacactaacagaaga 

aatacacaacagggatgcgaggcccttcatcatgatcctgggctcaaggaatt 

ccatatcaaataggggaaagactgctagaaatataaatcrgtacacaggaaatg 

accccagggaaatacgagacttgatggctgcagggcgcatgttagtagt agca 

ctgagggatgtcgaccctgagctgtctgaaatggtcgatttcaaggggacttt 

tttagatagggaggccctggaggctctaagtctcgggcaacctaaaccgaagc 

aggttaccaaggaagctgttaggaatttgatagaacagaaaaaagatgtggag 

atccctaactggtttgcatcagatgacccagtatttctggaagtggccitaaaa 

aatgataagtactacttagtaggagatgttggagagctaaaagatcaagctaaa 

GCACrTGGGGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGA 

CGTATGCCATGAAGCTATCTAGCrGGTTCCTCAAGGCATCAAACAAACAGATGA 

GTlTAACTCXZACTGTrTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAGA 

GCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAG 

CCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAGAT 

ACACXXATATGAAGCTTACCTGAAGTTGAAAGATTTCATAGAAGAAGAAGAGAA 

GAAACCTAGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACTTA 

AAAAAATAAGGTITCAAGGAAACXnTCAACACCAAGAAAATGCTCAACCCAGGG 

AAACTATCTGAACAGTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCA 

CCAGATTGGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAA 

TAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAG 

ATAGACAAGAGTGAAAACXXK3CAAAATCCAGAATTGCACAACAAATTGTTGGA 

GATTTrCCACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGA 

CGTGGGAGCAACTTGAGGCGGGGGTAAATAGAAAGGGGGCAGCAGGCTTCCT 

GGAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGAAC 

AATTGGTCAGGGATCTGAAGGOCGGGAGAAAGATAAAATATTATGAAACTGCA

ATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGACC

TGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAAGG

CTAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTrGTGATT

CCAGGATATGAAGGAAAGACCCCCTTGTTCAACATCTTTGATAAAGTGAGAAAG

GAATGGGACTCGTTCAATGAGCCAGTGGCCGTAAGTITTGACACCAAAGCCTG

GGACACTCAAGTGACTAGTAAGGATCTGCAACTrATrGGAGAAATCCAGAAATA

TTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACATGAC 

AGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGA 

GAGGGAGCGGCCAGCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCCT 

GACAATGATGTACGGCTTCTXKXjAAAGCACAGGGGTACCGTACAAGAGTTTCA 

ACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAACTGAA 

aaagggttagggctgaaatttgctaacaaagggatgcagattcttcatgaagc 

AGGCAAACCTCAGAAGATAACXjGAAGGGGAAAAGATGAAAGTTGCCTATAGAT 

ttgaggatatagagttctgttctcataccccagtccctgtraggtggtccgaca 

acaccagtagtcacatggccgggagagacaccgctgtgatactatcaaagatg 

gcaacaagattggattcaagtggagagaggggtaccacagcatatgaaaaagc 

ggtagccttcagtttctroctgatgtattcctggaacccxkntgttaggaggat 

ttgcctgttggtcctttcgcaacagccagagacagacccatcaaaacatgccac 

ttattattacaaaggtgatccaataggggcctataaagatgtaataggtcggaa 

tctaagtg aactg aaga gaacaggctttg ag aaattggc aaatctaaacctaag 

cctgtccacgttggggg1 ctggactaagcacacaagcaaaagaataattcagg 

acrgtgttgccattgggaaagaagagggcaactggctagttaa gccc gacagg 

CTGATATCCAGCAAAACIGGCCACTTATACATACCTGATAAAGGCTTTACATTAC 
AAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCGGTCATG 
GGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTGCTGCTGAG 
AAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAgacaaaatgtat

atattgtaaataaattaatccatgtra^ 

ctaaacagugtcaagattatctacctcaagataacactacatttaat 

ttggactagggaagacctctaacagccccc 



Gtatacgagaattagaaaaggcactcgtatacgtangggcaatlaaaaataataattaggcctaggmcatggcacgtgccagccccct

gatgggggcgacaciccaccatgaatcaacccctgtgaggaactactgtcttcacgcagaaagcgtctagccmggcgtta 

tgtcgtgcagcctccaggaccccaxtcccgggagagccatagtggtctgcggaac^ 

cgggtcctttcttggataaacccgctcaatgcctggagamgggcgtgcccccgc^ 

aggccttgtggtactgcctgatagggtgcttgcgagtgcra 

CTAAACCTCAAAGAAAAACC^AACXn'AACACCAACXGTCGOCCACAGGACGTC 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACITGTTGCCGCGCAG 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGACGTCAGCCTATCCCCAAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGCKTrCAGCCCGGGTACCCTTGGCCCCTCTATGGCi^ 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGATACCCTTAC^ 

GCGGCITCXjCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTTGGA 

GGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGA 

ACTATGCAACAGGGAACCTTCCTGGTTGCTCTT^ 

CKTTCTCTTGCCTGACCGTGCCCXX^ 

GCTTTACCATGTCACCAATGATTGCCCTAACTCGAGTATTGTGTACXj 
CGATGCCATCCTGCACACTCCGGGGTGTGTCCCTTGCGTTCGCGAGGGTAACG 
CCTCGAGGTGTTGGGTGGCGGTGACCCCCACGGTGGCCACCAGGGACGGCAA 
ACTCCCCACAACGCAGCTTCX}A(X^^ 

CCCTCTGCTCGGCCCTCTACGTGGGGGACCTGTGCGGGTCTC 

GTCAACTGTTTACCTTCTCTCCCAGGCGCGACTGGACGACGCA^
GTTCTATCTATCCCGGCCATATAACXXX3TCATCGCATGGCATGGGATATGATGA
TGAACTGGTCCCCTACXKKIAGCGTTGGTGGTAGCrCAGCrGCTCCGGATCCCA
CAAGCCATCATGGACATGATCGCTGGTGCTCACTGGGGAGTCCTGGCGGGCAT
AGCGTATTTCTCCATGGTGGGGAACTGGGCGAAGGTCCTGGTAGTGCTGCTGC
TATTTGCCGGCGTCGACGCGGAAACCCACGTCACCGGGGGAAGTGCCGGCCG
CACCACGGCTGGGCTTGTTGGTCnX^CrTACACCAGGCGCCAAGCAGAACATCC
AACTGATCAACACCAACGGCAGTTGGCACATCAATAGCACGGCCnTGAACTGC 
AATGAAAGCCTTAACACCGGCIXKnTACKTAGGG 
AACTCTTCAGGCnXjTCCrGAGA 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCTCGACGAAC 
GCCCCTACTGCTGGCACTACCC^ 

AGCGTGTGTGGCCCGGTATATTGCTTCACTCCCAGCCCCGTGGTGGTGGGAAC 
GACCGACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGCAAATGATACGGAT 
GTCTTCGTCCTTAACAACACCAGGCCACrc 

TGGATGAACTCAACTGGATTCACCAAAGTGTGCGGAGCGCCCCCTTGTGTCAT 
CGGAGGGGTGGGCAACAACACCTIXjCTCTGCCCCACTGATTGTTTCCGCAAGC 
ATCCGGAAGCCACATACTCTCGGTGCGGCTCCGGTCCCTGGATTACACCCAGG 

tgcatgk3tcgactacccgtatackktitggcactatcot 

accatattg\aagtcaggatgtacgtgggaggggtcgagcacaggctggaag 

cggcctgcaactggacgcggggcgaacgctgtgatctggaagacagggacag 

gtccgagctcagcccattgctgctgtccaccacacagtggcaggtccttccgt 

gttctttcacgaccctgccagcxtrgtccaccggcctcatccacctccaccag 

acattgtggacgtgcagtacttgtacggggtagggtcaagcatcgcgtcctgg 

gccattaagtgggagtacgtcgttctcctgttcctcctgcitgcagacgcgcgc 

gtctgctcctgcitgtggatgatgt^^ 

gagaacctcgtaatactcaatgcagcatccctggccgggacgcacggtcttgt 
GTCCTTCCTCGTGTTCITCTGCITTGCCT

CGGAGCGGTCTACGCCTTCrACGGGAAGTGGGTCTTACrCTTATACCA 
AGTGGTACACCCAATCAAATCTGTAATTGTGATCCTACT 

GGTAAAGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTT 

TTACAACAGTAGTACTAATCGTCATAGGTTTAATCATAGCTAGGCGTGACCCAA 

CTATAGTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTGAACTGACC 

CACCAGCCTOGAGTTGACAT CGCTG TGGCGGTCATC 

GGTTAGCTATGTGACAGATTATTTTAGATATAAAAAATGGT^ 

AGCCTGGTATCTCCGGTGTTCTIXjATAAGAAGCXn'AATA 

GAGATGCCAGAGGTAACTATCCCAAACTGGAGA(XACT 

TATTTGATCTCAACAACAATTGTAACXjAGGTGG 

TTGTTGCAATGTGTGCCTATCTTATTGCTGGT 

TAACCCTAATACTX3ATCCTCKXT^ 

AACTGTTAGGACTGATACAGAAAGAAGTTGGCTAGGGGGGATAGACT 

GAGTTGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCCT 

CATCAAGGCAGAAAGCACAGGGGAATTTTTCTATACTC^ 

CAACACTGATAAGTTGCGTCAGCACT^ 

TAACITTGGACirTATGTACTACATGCA^ 

GAGGTACCAACATAATATCCAGGTTAGTGGCAGCAC ^^ 

TCX^TGGAAGAAGACKlAGAGCAAAGGCITAAAGAAGTm 

AAGGTIX3AGAAACXTAATAATAAAACATAAGGTAAGGAATG 

CTIX3CTACXK3GGAGGAGGAAGTCTACGGTAT^^ 

AAaSCCAGTACACTCAGTAAGAGCAGG^ 

GGGCCGAGAGTGGAAAGGTGGCACCKKXCA^ 

COGATAACGTGTGGGATGTCGCTAGCAGATTTTGAAGAAAGAC^ 

AATCTITATAAGGGAAGGCAACTT^ 

AGCATAGGAGGTTTGAAATGGACXXjGGAACCTAAGAGTGC 

GAGTGTAATAGGCTGCATCCTGCTGAGGAA 

CATGTTGGGCCTCAAAATCACCTACITIX3CG 

TATCACAGAGTGGGCTGGATGCCAGCGTGTGGGAATCnX: 

GAGTCXXnTGTCACATCTCATTTGGTTC^ 

ATGGCITTGTACAATATACXXKTAGGGGGCAACrATTTCTG 

TACIXX3CAACTAAAGTAAAA^ 

GGTAATCTGGAACATCTTGGGTGGATCCTAAG 

GATCACAGAGCACGAAAAATGCCACATTAATATACTGGATAAACT 

TTTCGGGATCATGCCAAGGGGGACTACAC(XAGAGCX^CCX3GTO 

CGAGCITACTAAAAGTGAGGAGGGGTCTGGAGACTGCCT 

CAAGGCX3GGATAAGTTCAGTCGACX^TGTAACCGCCG^ 

CTGTGACAGCATGGGACGAACTAGAGTGGTTTGCCAAAGCAACAAC^ 

CCGATGAGACAGAGTATGGCGTCAAGACTGACTCAGGGTGCCCAGACGGTGC 

CAGATGTTATGTGTTAAATCrAGAGGCCCnTAACATATCA 

CAGTCGTTCACCTCXZAuAAAGACAGGTGGAGAATTCAC^ 

GGC\CA(XX3GCTITCTra 

ATATTTGAAGCCTCC^GCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGA 

ATGAAGAGTCTAAACCTACAAAAATAATGAGTGGAATCCAG 

ACAGAGCAGACCTGACCGAGATGGTCAAGAAGATAACCAGCATGAACAGGGG 

AGACITCAAGCAGATTACTTTGGCAACAGGGGCAGGCAAAACCACA 

CAAAAGCAGTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATA 

CX^TTAAGGGCAGCGGCAGAGTCAGTCTACCAGT^ 

AAGCATCTCITTTAACCTAAGGATAGGGGACATGAAAGA 

CCGGGATAACX7TATGCATCATACX3GGTACTTCTGCCAAATGCCT 

TCAGAGCrcCTATGGTAGAATACTCATACATATTCTTAGATGAATA^ 

CACTCCTGAACAACTGGCAATTATCGGGAAGATCCACAGATTTT 

AAGGGTTGTCGCCATGACTGCCACGCCAGCAGGGTCGGTGACCACAACAGGT 
CAAAAGCACCCAATAGAGGAATTCATAGCCCCCGAGGTAATGAAAGGGGAGG

ATCTTGGTAGTCAGTT(XTIX3ATATAGCAGGGTTAAAAATACCAGTGGATGAGA 

TGAAAGGCAATATGTTGGTTTTTGTACCAACGAGAAACATGGCAGTAGAGGTA 

GCAAAGAAGCTAAAAGCTAAGGGCTATAACTCTGGATACTATTACAGTGGAGA 

GGATCCAGCCAATCTGAGAGTTGTGACATCACAATCCCCCTATGTAATCGTGGC 

TACAAATGCTATTGAATCAGGAGTGAGACTACGAGATTTGGACACGGTTATAGA 

CACGGGGTTGAAATGTGAAAAGAGGGTGAGGGTATCATGAAAGATACCCTTCA 

TCGTAACAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCG 

TAGGGGCAGAGTAGGTAGAGTGAAACCCGGGAGGTATTATAGGAGCCAGGAA 

AC^GCAACAGGGTCAAAGGACTACCACTATGACCTCTTGCAGGCACAAAGATA 

CGGGATTGAGGATGGAATCAACGTGACGAAATCCTrTAGGGAGATGAATTACG 

ATTGGAGCCTATACGAGGAGGACAGCCTACTAATAACCCAGCTGGAAATACTA 

AATAATCTACrCATCrcAGAAGACTrGCCAGCGGCTGTTAAGAACATAATGG<X 

AGGACTGATCACOCAGAGCCAATCCAACTTGCATACAACAGCTATGAAGTCCA 

GGTCCCGGTCCTGTTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACG 

AAAATTACTCGTTTCTAAATGCCAGAAAGTITAGGGGAGGATGTGCCCGTGTATA 

TCTACGCTACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGG 

CCTGATCCTGGGAAGCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGT 

GACCGGGTTGTCCTCGGCroAAAATGCCCTACTAGTGGCTTTATTTGGGTATGT 

GGGTTACCAGGCrcrCTCAAAGAGGCATGTGCCAATGATAACAGACATATATAC 

CATOGAGGACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACG 

CCATAAAAACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGA 

CGTGGAAAAAATCATGGGAGCX]ATTTCAGATTATGCAGCrGGGGGACTGGAGT 

TTGTTAAATCXZXZAAGCAG AAAAG AT AAAAACAGC" 1 LXJ 1 ' l'1 1/ 1 ' 1 1 AAAG AAAACG 

CAGAAGCCGCAAAAGGGTATGTCCAAAAATTCATTGACTCATTAAT^ 

AAGAAGAAATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAA 

AGCATAGCTGCAAGACTGGGGCATGAAACAGCGTTTGCCACACTAGTGTTAAA

GTGGCTAGCTTTTGGAGGGGAATCAGTGTCAGACCACGTCAAGCAGGCGGCA

GTTOATTTAGTGGTCTATTATGTGATGAATAAGCCTTCCrrCCCAGGTC ^ 

GAGACACAGCAAGAAGGGAGGCGATTCGTCGCAAGCCTGTTCATCTCCGCACT

GGCAACCTACACATACAAAACTrGGAATTACCACAATCTCTCTAAAGTGGTGGA

ACCAGCCCTGGCTTACCTCCOCTATGCTACX^GCGCATTAAAAATGTTCACCCC

AACGCGGCTCGAGAGOGTGGTGATACTGAGCACCAGGATATATAAAACATACC

TCTCTATAAGGAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGC 

AGCCATGGAAATCCTGT^CAAAACCCAGTATC3GGTAGGTATATCTGTGATGTT 

GGGGGTAGGGGCAATCGCTGCGCAC^CGCTATTGAGTCCAGTGAACAGAAA 

AGGACCCT ACTTATG AAGGTGTTTGTAAAG AAC 1 ' 1 L' 1 ' 1 GGATCAGGCTGC AAC A 

GATGAGCTGGTAAAAGAAAACCCAGAAAAAATTATAATGGCOTATTTGAAGCA 

GTCCAGACAATTGGTAACCCCCroAGACTAATATACCACCTGTATGGGGTTTAC 

TACAAAGGTTGGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACT 

TATTCACATTGATAATGTTTGAAGCCTTCGAGTTATTAGGGATGGACTCACAAG 

GGAAAATAAGGAAOCTGTCOGGAAATTACATTTTGGATTTGATATACGGCCTAC 

ACAAGCAAATCAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGC 

ACCCTTTAGTTGTGACIGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAG 

ACAACTATTTGAGGGTAGAAACCAGGTGCCCATGTGGCTATGAGAT OAAAG CT 

TTCAAAAATGTAGGTCGCAAACnTACCAAAGTGGAGGAGAGCGGGCCTTTCCT 

ATGTAGAAACAGACCTGGTAGGGGACCAGTCAACTACAGAGTCACCAAGTATT 

ACGATGACAACCTCAGAGAGATAAAACCAGTAGCAAAGTTGGAAGGACAGGTA 

GAGCACTACTACAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAAT 

GCTCTTGGCCACTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAG 

CTAAGAGATATACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCCC 

AATCACCGTGCTCTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGT 

ACAGTTTCTAAAAATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTC 

CAATCTGACCAGGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGG 
AAATACCCACCGCTACGGTCACCACATGGCTAGCTTACACCITCGTGAATGAAG

ACGTAGK3GACTATAAAAOCAGTACTAGGAGAGAGAGTAATCCCCGAOCCTGTA 

GTTGATATCAATTTACAACCAGAGGTGCAAGTGGACACGTCAGAGGTTGGGAT 

CACAATAATTGGAAGGGAAACCCIXjATCACAACGGGAGTGACACCTGTCTTGG 

AAAAAGT AG AGCCTG ACGCCAGCG AGAACCAAAACXCXX3TG AAG ATCGGGTTG 

GATGAGGGTAATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGA 

AATACACAACAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATT 

CCATATCAAATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATG 

ACCCCAGGGAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCA 

CTGAGGGATGTCGACCCTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACTTT 

TTTAGATAGGGAGGCGCTGGAGGCTCTAAGTCTCGGGCAACXJrAAACCGAAGC 

AGGTTACCAAGGAAGCKJITAGGAATITGATAGAACAGAAAAAAGATGTGGAG 

ATCXXn"AACTGGTTTXjCATCAGATGACCCAGTATTrCTGGAAGTGGCCTTAAAA 

AATGATAAGTACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAA 

AGCACrTGGGGCCACGGATCAGACAAGAATrATAAAGGAGGTAGGCTCAAGG 

ACGTATGCCATGAAGCTATCTAGCTGGTKXnCAAGGCATCAAACAAACAGATG 

AGTTTAACTCXIACTGTTTGAGGAATTGTIXKn'ACGGTGCCCACCTGCAACTAAG 

AGCAATAAGGGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGA 

GCCCCTCGGTTGCGGGGTGCACCTAGGTACAATACCAGCCAGAAGGGTGAAG 

ATACACCCATATGAAGCTTA(XTGAAGTraAAAGATTTCATAGAAGAAGAAGAG 

AAGAAACCTAGGGTTAAGGATACAGTAATAAGAGAGCACAACAAATGGATACT 

TAAAAAAATAAGGTITCAAGGAAACCTCAAC^OCAAGAAAATGCTCAAOCCTG^ 

GAAACTATCKjAACAGTTGGAC\GGGAGGGGCGCAAGAGGAACATCTACAAC 

CACXIAGATTGGTACTATAATGTIX^AGTGCAGGCATAAGGCTGGAGAAATTGCC 

AATAGTGAGGGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATA 

AGATAGACAAGAGTGAAAACCGGCAAAATCCAGAATTGCACAACAAATTGTTG 

GAGATTTTCXIACACGATAGCXZCAACCCACCCrGAAACACIACCTACXXnX} 

GACGTGGGAGCAACTTGAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCTTC

CTXXjAGAAGAAGAACATCGGAGAAGTATTGGATTCAGAAAAGCACCTGGTAGA

ACAATTGGTCAGGGATCTGAAGG(XGGGAGAAAGATAAAATATTATGAAACTG

CAATACCAAAAAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGA

CCTGGTGGTTGAGAAGAGGCCAAGAGTTATCCAATACCCTGAAGCCAAGACAA

GGCrAGCCATCACTAAGGTCATGTATAACTGGGTGAAACAGCAGCCCGTTGTG

ATTCCAGGATATGAAGGAAAGACCCCCTTGTTCAAC^TCTTTGATAAAGTGAGA

AAGG AATGGG ACTCGTTCAATG AGCCAGTGGCCGTAAU 1 1 1 1 1 ' 1 G ACACCAAAGC 

CTGGGACACTCAAGTGACTAGTAAGGATC1X3CAACITATTGGAGAAATCCAGA 

AATATTACTATAAGAAGGAGTGGCACAAGTTCATTGACACCATCACCGACCACA 

TGACAGAAGTACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGG 

CAGAGAGGGAGCGGCCAGCCAGACAQ^AGTGCTGGCAACAGCATGTTAAATG 

TCCIXjACAATGATGTACGCCTTCTGCGAAAGCACAGGGGTACCXjrACAAGAGT 

TTCAACAGGGTGGCAAGGATCCACGTCTGTGGGGATGATGGCTTCTTAATAAC 

TGAAAAAGGGTTAGGGCTGAAATTTGCTAACAAAGGGATGCAGATTCTTCATG 

AAGCAGGCAAACCTCAGAAGATAAOGGAAGGGGAAAAGATGAAAGTTCCCTAT 

AGATTTGAGGATATAGAGTTCnjTTCTCATACCCCAGTCCCTGTTAGGTGGTCC 

GACAACACCAGTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAA 

GATGGCAACAAGATTGGATrCAAGTGGAGAGAGGGGTACCACAGCATATGAAA 

AAGCGGTAGCCTTCAGTTTCTTGCTGATGTATTCCTGGAACCOjCTTGTTAGGA 

GGATTTGCCTGTIXXnXXTTTCGCAACAGCCAGAGACAGACCX!ATCAAAACATG 

CCACTTATTATTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTC 

GGAATCTAAGTGAACTGAAGAGAACAGGCTTTCAGAAATTGGCAAATCTAAAC 

CTAAGCCTGTCCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAAT 

TCAGGACTGTGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCG 

ACAGGCTGATATCCAGCAAAACTGGCCACTTATACATACCTGATAAAGGCTTTA 

CATTACAAGGAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCG 
GTCATGGGGGTTGGGACTGAGAGATACAAGTTAGGTCCCATAGTCAATCTC

GCTGAGAAGGTTGAAAATTCTGCTCATGACGGCCGTCGGCGTCAGCAGCTGAg 

acaaaatgtatatattgtaa^^ 

TTACTGGCCGAAGCCGCT TGGA ATAAGGCCGGTGTGCGTTTGTCTATATGTTAT 
TITCCACCATATTGCCGTCnTITGGCAATGTGAGGG^ 

tcttcttgaotagcattcctacksggtcitrc 

gtctgttg aatgtcgtg aagg aagcagttcctctgg aagc vic 1'ig aag acaaa 
caacgtctgtagcgaccctttck:aggcagcggaaccccccacctggcgacagg 
tgcctxhtjcggcxaaaagccacgtct 

ACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTC^^ 

CXTTCAAGCGTATTCAACAAGGGGCIXjAAGGATGCCXIAGAAGGTACCCCATTOT 

ATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTT 

GTTAAAAAACXnXJTAGGCCC(XXX5AACC^ 

AACACGATGATAAGCTTGCCACAACcatgaccga^ 

cgtcccccgggccgtacgcaccctegccgc^^ 

gagcgggtcaccgagctgcaagaactcttcctcacgcgcgtcgggctcga 

gcggtggcggtctggaccacgccggagagcgtcgaagcgggggcggtgttcgccgagatcggcccgcgca 

cggttcccggaggccgcgcagcaacagatggaaggcctcctggcgccgcac^ 

cgtcggcgtctcgca:gaccacca^ggcaagggtctgggcagcgccgtcgt 

gggtgcccgccttcctggagacctccgcgra 

gcccgaaggaccgcgcgacctggtgcatgacccgc^ 

aaaggagcgcacgaccccatgaaATCK^iTCGATCGTACXjAATTAA 

(XIAGCAAAACIXXX^CACTTATACATACCTGATAAAC^ 

AGCATTATGAGCAACTGCAGCTAAGAACAGAGACAAACCCQ 

GGGACTCAGAGATACAAGTTAGGTCCCATAGTCAATCT 

GAAAATTCTOCTCATGACGGCCGTCGGOT 

aattaatccatgtacatagtgtatotaaa^^ 

teaaga ttaf rt a rctcaa g a taa r a r ffl ca n^^ 

g aaga c ct c taacag coocc 
Gtatacgagaattagaaaaggcactcgta^
gatgggggcgacactccaccatgaatcactcccctgtgaggaac^^ 
tgtcgtgcagcctccaggaocccccctcccgggagagccam 
cgggtcctttcttggataaacccgctcaatgc^ 

aggccttgtggtactgcctgatagggtgcttgcgagtgccccgggaggtctcgtagacc^ 

CTAAACCTCAAAGAAAAACCAAACGT^ 

AAGTTCCCGGGTGGCGGTCAGATCGTTGGT^ 

GGGCCCTAGATTGGGTGTGCGCGCGACGAGGAAGACTTCCGAGCGGTCGCAA 

CCTCGAGGTAGA(XrrCAGCCTATCCCX:AAGGCACGTCGGCCCGAGGGCAGGA 

CCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGTTGCGGG 

TGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGGCCTAGCTGGGGCCCCAC 

AGACCXTCXXKjCGTAGGTCGCGCAATTTGGCT 

gcggcitcgccxjaccnx3atggggtacataccgctcg 

ggcgctgcx:agggcccixxkxk:atggcgtccgggttct 

actatgcaacagggaaccttccrggttxjcixjitt^ 

CKTCTCTIXXXJrcACCGTGCCCC^ 

gctttacxiatgtcaccaatgattgcxxtaacrc 

cgatgccatcctggacactxxxx^ 

cctcgaggtgttgggtggcxkti^ 

ACTCCCCACAACXK^GCirOjACGTCAT^ 

CCCTCTGCTCGGCCCTCTACXnXKKKK3ACCTGTGK^^

GTCAACTGTTTAOCTTCTCTCXXIAGGC

GTTCTATCTATXXXXKjCCATATAAOGGGTC^

TGAACTGGTCXCCTACGGCAGKXriTGGTGCT

CAAGCCATCATGGACATGATCXXnTK3TGCT

AGCGTATTTCTCCATGGTCK^

TATTTGCCGGCGTCGAOGCGGAA^

<^CCACGGCTCK3GCTTGTTG^ fa 
AACTX5ATCAACAOCAACXKXIAGTTGGCACATCAATAG 
AATGAAAGOCTTAACACCGGCTGGTTAGC^ 

GCCCAGGGCTGGGGTCCTATCAGTTATGCCAACGGAAGCGGCCT 

GCCCCTACTGCTGGCACTACCXTCCAAGACCTTGTGGCAT^ 

AGCGTGTGTGGCCCGGTATATTGCITCACTCCCAGCCCCGTGGTGGTGGGAA 

GACCX3ACAGGTCGGGCGCGCCTACCTACAGCTGGGGTGC 

GTCTTCGTCCTTAACAACACCAGa 

TGGATGAACTCAACTTjGATTCACCAA^ 

CGGAGGGGTGGGCAACAACACCTTGCTCTGCCCCACTGATTGTTTCCGCAAGC 
ATCCGGAAGCCACATACTCTCGGTGCGGCTCCXXn'CCCTGGATTACACCCAGG 
TGCATGGTCX3ACTACXXXn , ATAGGCITIXK3CACT 

ACCATATTCAAAGTCAGGATGTACXnXK3GAGGGGTCGAGCACAGGCTGGAAG 

CGGCCTGCAACTGGACGCXjGGGCGAACGCnXjTGATCTGGAAGACAGGGACAG 

GTCCGAGCTCAGCCC^TTCC^^ 

GTTCTTTCACGACCCTGCCAGCC^ 

ACATTGTGGACGTGCAGTACTTGTACGGGGTAGGGTCAAGCATCGCGTCCTTO 

GCCATTAAGTGGGAGTACGTCGTTCTCCTGTTC 

GTCTGCIXXnX3CTTGTGGATGATGTTACT 

GAGAACCTCXn^AATACTCAA TGCA GCATCCCTGGCCGGGACGCACGGTCTrOT 
GTCCTTCCTCGTGTTCTTCTGCTTTGCGTGGT 
CGGAGCGGTCTACGCCTTCTACGGGATGTGGCCTCrCCTCCTGCrCCTGCTGG

CGTTGCCTCAGCGGGCATACGCACTGGACACOGAGGTGGCCXKX5TCGTGTGG 

CGGCXi 1 ' 1 U l'l IT 1 GTCGGGTTAATGGCGCTG ACTCTGTCGCC ATATTACAAGCG 

CTACATCAGCTGGTGCATGTGGTGGCTTt^GTATTTTCTGACCAGAGTAGAAGC 

GCAACTGCACGTGTGGGTTCCCCCCCTCAACGTCCXjGGGGGGGCGCGATGCC 

GTCATCTTACTCATGTGTGTTGTACACCCGACTCTGGTATTTGACATCACXrA 

TACTCCTGGCCATCTTCGGACCCCTrTTGGATTCTTCAAGCGAGTTTGCTTAAAGT 

CCCCTACTTCGTGCGCGTTCAAGGCCTTCTCCGGATCTGCGCGCTAGCGCGGA 

AGATAGCCGGAGGTCATTACGTGCAAATGGCCATCATCAAGTTAGGGGCGCTT 

ACTGGCACCTATGTGTATAACCATCTCACCCCTCTTCGAGACTGGGCGCACAAC 

GGCCTGCGAGATCTGGCCGTGGCTGTGGAACCAGTCGTCTTCTCCCGAATGGA 

GACCAAGCTCATCACXjTGGGGGGCAGATACCGCCGCGTGCGGTGACATCATC 

AACGGCrTGCCCGTCTCTGCCCGTAGGGGCCAGGAGATACTGCTTGGGCCAGC 

CGACGGAATGGTCTCCAAGGGGTGGAGGTrGCTGGCGCCCATCAOjGCGTAC 

GCCCAGCAGACGAGAGGCCTCCTAGGGTGTATAATCACCAGCCTGACTGGCCG 

GGACAAAAACCAAGTGGAGGGTGAGGTGCAGATCGTGTCAACTGCrACCCAAA 

GCTTOCTGGCAACGTGCATCAATGGGGTATGCTGGACTGTCTACCAGGGGGCC 

GGAACGAGGACCATCGCATCACCCAAGGGTCCTGTCATCCAGATGTATACCAA 

TGTGGACCAAGACCirGTGGGCTGGCCOGCTCCTCAAGGTTCGCGCTCATTG 

CACCCTGCACXnX3CGGCTCCTCGGA<XTTTACCTGGTCACGAGGCACGC^ 

GTCATTCCCGTGCGCCGGCGAGGTGATAGCAGGGGTACKXTGCTTTCGCCCCG 

GCCCATTTCCTACTTGAAAGGCTCCTCGGGGGGTCCGCTGTTGTG^ 

GACACXjCCGTGGGCCTATTCAGGGCCGCGGTGTGCACCCGTGGAGTGGCTAA 

GGCGGTGGACTTTATCCCTGTGGAGAACCTAGAGACAACCATGAGATCCCGGG 

TGTTCACGGACAACTXICTCTCCACCAGCAGTGCCCC^

CACCTGCATGCTCCX^CXXKKrAGCGGTAAGAG^CX^AGGTCCCGGCTGCGTA

CGCAGGCCAGGGCTACAAGGTGTrGGTGCTCAACOCCTCTGTTGCTGCAACGC

TGGGCTrTGGTGCTTAGATGTCCAAGGCCCATGGGGTrGATOCTAATATC^

(XXiGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCAC^

AAGTTOCTrGCCGACGGCXKXnXXn'CAGGAGGTGCTrATGACATAATAATTTGT

GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATCGGCACTGTCCT

TGACCAAGCAGAGACTGCGGGGGCGAGACIXKnTGTGCTCGCCACTGCTACC

CCTCCGGGCTCCGTCACTGTGTCCCATCCTAACATCGAGGAGGTIXjCTCTGTCC 

ACCACCGGAGAGATCCCCTTTTACGGCAAGGCTATCCCCCTCGAGGTGATCAA 

GGGGGGAAGACATCTCATCTTCnX3CX}ACTCAAAGAAGAAGTGCGACGAGCTCG 

CCGCGAAGCTGGTCGCATTGGGCATCAATGCCGTGGCCTACTACCGCGGTCTT 

GACGTGTCTGTCATCCCGACCAGCGGCGATGTTGTCGTCGTGTCGACCGATGC 

TCTCATGACnXjGCTTTACCGGCGACTTCGACTCTGTGATAGACTCCAACACGTG 

TGTCACTCAGACAGTOGATrTCAGCXrrroACXXTACCTTTACCATTGAGACAAC 

CACGCTCCCOCAGGATGCTGTCTCCAGGACTCAACGCCGGGGCAGGACTGGC 

AGGGGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCG 

GCATGTTCG ACTCXjrcCGTCCTCTGTGAGTGCT ATG ACGCGGGCTG" 1 XJC 1 ' I GG 

TATGAGCTCACGCCCGCCGAGACTACAGTTAGGCTACGAGCGTACATGAACAC 

CCCGGGGCTTCCCGTGTGCCAGGACCATCTroAATTTTGGGAGGGCGTCTTTA 

OGGGC CTCA CTCATATAGATGCCCACTTTCTATCCCAGACAAAOCAGAGTGGG 

GAGAACITTCCTTACCTGGTAGCGTACCAAGCCACCXn'GTGCGCTAGGGCTCA 

AGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATC03CCTTAAAC 

(XACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 

GAAGTCACCCTGACGCACCCAATCACCAAATACATCATGACATGCATGTCGGCC 

GACCTGGAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCXjTCCTGGCTG 

CTCTGGCCGCGTATTGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGATT 

GTCTTGTCCGGGAAGCCGGCAATTATACCTGACAGGGAGGTTCTCTACCAGGA 

GTTCGATGAGATGGAAGAGTGCTCTCAGCACTTACCX3TACATCGAGCAAGGGA 

TXJATGCTCGCTGAGCAGTTCAAGCAGAAGGCCCTCGGCCTCCTGCAGACCGCG 
TCCCGCCAAGCAGAGGTTATCACCCCTGCTGTCCAGACCAACTGGCAGAAACT
CGAGGTCTTCTGGGCGAAGCACATGTGGAATTTCATCAGTGGGATACAAT ACTT 
GGCGGGCCTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTT 
TACAGCTGCQjTCACXIAGCCCACTAACX^CTC^^ 

ATTGGGGGGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACCGCC 

TTTGTGGGCGCTGGCTTAGCTGGCGCCGCCATCX3GCAGCGTTGGACTGGGGA 

AGGTCCTCGTGGACATTCTTGCAGGGTATGGCGCGGGCGTGGCGGGAGCTCT 

TGTAGCCTTCAAGATCATGAGCGGTGAGGTCCCCTCCACGGAGGACCTGGTCA 

ATCTGCTGCCCGCCATCCTCTCGCCTGGAGCCCTTGTAGTCGGTGTGGTCTGC 

GCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAATGGA 

TGAACCGGCTAATAGGCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC 

TACGTGCCGGAGAGCGATGCAGCCGCCCGCGTCACItjCCATACTCAGCAGCCT 

CACTGTAACCCAGCTCCTGATcg CTAG accatggggtaccgagC GTTA CTGGCCGAAGCC 

GCTTGGAATAAGGCCGGTGTGCGTTTCTCTATATGTTATTTrCCACCATATTGCC 

GTCTTTTGGCAATGTGAGGGCCCGGAAACCTGGCCCTGTCTTCTTGACXjAGCA 

TTCCTAGGGGTCTTTCCOCTCTCGCCAAAGGAATCGAAGGTCTGTTO 

TGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAA(XrrCTGTA 

ACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGACAGGTGCCTCTGCGGCCA 

AAAGCCACGTGTATAAGATAC^OCTGCAAAGGCGGCACAACCCCAGTGCCACX} 

TTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCA^ 

ACAAGGGGCTGAAGGATGCCCAGAAGGTACCCCATTGTATGGGATCTGATCTG 

GGGCCTCGGTGCACATGCTTTAC^TGTGTITAGTCGAGGTTAAAAAACGTCTAG 

GCCCCCCG AACCACGGGG ACQ' 1 GG I'l'l'l GCTTTGAAAAACACG ATGATAATAT 

GGAGTTGATCACAAATCAACTTITATACAAAACATACAAACAAAAACCOGTCGG 

GGTGGAGGAACCTGTrTATGATCAGGCAGGTGATCCCTTATTTGGTGAAAGGG 

GAGCAGTCCACCCTCAATCGACGCTAAAGCTCCCACACAAGAGAGGGGAACGC

GATGTrcCAACCAACTTGGCATCCTTACCAAAAAGAGGTGACTGCAGGTCGGG

TAATAGCAGAGGACCTGTOAGCGGGATCTACCTGAAGCCAOGGCCACTATTTT

ACCAGGACTATAAAGGTCCCGTCTATCACAGGGCCXrCGCIGGAGCTCTrTGAG

GAGGGATCCATGTGTGAAACGACTAAACGGATAGGGAGAGTAACTGGAAGTG

ACGGAAAGCTGTACCACATTTATGTGTGTATAGATGGATGTATAATAATAAAAA

GTGCCACGAGAAGTTACCAAAGGGTGTTCAGGTGGGTCX1ATAATAGGCTTGAC

TGCCCTCTATGGGTCACAAGTTGCTCAGACACGAAAGAAGAGGGAGCAACAaag

cttGCATTGTTGGCGTGGGCAATAATAGCTATAGTITrGTTTCAAGTTACAATGGG 

AGAAAACATAACAC^GTGGAACctgcagTGGTTTGACXZTGGAGGTGACTGACCAT 

CACCGGGATrACTTCGCTGAGTCCATATTAGTGGTGGTAGTAGCCCTCTTGGGT 

GGCAGATATGTACTTIXjGTTACTGGTTACATACATGGTCTTATCAG^ 

GCCTTAGGGATTCAGTATGGATCAGGGGAAGTGGTGATGATGGGCAACTTGCT 

AACCCATAACAATATTGAAGTGGTGACATACTTCTrGCTGCTGTACCTACrGCT 

GAGGGAGGAGAGCGTAAAGAAGTGGGTCTTACTCTTATACCACATCTTAGTGG 

TACACCCAATCAAATCTGTAATTGTGATCCTACTGATGATTGGGGATGTGGTA^ 

AGGCCGATTCAGGGGGCCAAGAGTACTTGGGGAAAATAGACCTCTGTTTTACA 

ACAGTAGTACTAATOGTCATAGGTTTAATCATAGCTAGGCGTGACCX1AACTATA 

GTGCCACTGGTAACAATAATGGCAGCACTGAGGGTCACTCAACTGACCCACCA 

GCCTGGAGTTGACATCGCTGTGGCGGTCATGACTATAACCCTACTGATGGTTA 

GCTATGTGACAGATrATTTTAGATATAAAAAATGGTTACAGTGCATTCrCAGCCT 

GGTATCTGGGGTGTTCTIX3ATAAGAAGCCTAATAT ACCTA GGTAGAATCGAGAT 

CKXIAGAGGTAACTATCCCAAACTGGAGACCACTAACTTrAATACTATTATATTTG 

ATCTCAACAACAATTGTAACGAGGTGGAAGGTIX3ACGTGGCTGGCCTATTGTT 

GCAATGTGTGCCTATCTT ATTGCTGGTC ACAACCTTGTGGGCCG AC" 1 ' I U r 1 AAC 

CCTAATACTGATCCTGGCTACCTATGAATIGGTTAAATTATACTATCTGAAAACT 

GTTAGGACTGATATAGAAAGAAGTTGGCTAGGGGGGATAGACTATACAAGAGT

TGACTCCATCTACGACGTTGATGAGAGTGGAGAGGGCGTATATCrnTTCCATC 

AAGGCAGAAAGCACAGGGGAATTTTTCTATACTCTTGCCCCTTATCAAAGCAAC 
ACTCATAAGTTGCGTCAGCAGTAAATGGCAGCTAATATACATGAGTC

TTTGGACTTTATGTACTACATGCACAGGAAAGTTATAGAAGAGATCTCAGGAGG 

TACCAACViTAATATCCAGGTTAGTGGCAGCACTCATAGAGCTGAACTGGTCCAT 

GGAAGAAGAGGAGAGCAAAGGCTTAAAGAAGTTTTATCTATTGTCTGGAAGGT 

TGAGAAAGCTAATAATAAAACATAAGGTAAGGAATGAGACCGTGGCrTCTTGGT 

ACGGGGAGGAGGAAGTCTACGGTATGCCAAAGATCATGACTATAATCAAGGCC 

AGTACACTGAGTAAGAGCAGGCACTGCATAATATGCACTGTATGTGAGGGCCG 

AGAGTGGAAAGGTGGCACCTGCCCAAAATGTGGACGCCATGGGAAGCCGATA 

ACGTGTGGGATGTCX iCrAG CAGATTriGAAGAAAGACACTATAAAAGAATCTTT 

ATAAGGGAAGGCAACTTTGAGGGTATGTGCAGCCGATGCCAGGGAAAGCATA 

GGAGGTTTGAAATGGACCGGGAACCTAAGAGTGCCAGATACTGTGCTGAGTGT 

AATAGGCTGCATCCTGCTGAGGAAGGTGACITTTGGGCAGAGTCGAGCATGTT 

GGGCCTCAAAATCACCTACTTTGCGCTGATGGATGGAAAGGTGTATGATATCAC 

AGAGTGGGCTGGATGCCAGOjTGTGGGAATCrcCCXIAGATACCC\CAGAGTCC 

CITGTCAQ\TCTCATTTGGTTCACGGATGCC^ 

TTGTACAATATACCGCTAGGGGGC^CTATTTCnX^ 

CAACTAAAGTAAAAATGCTCATGGTAGGCAAOCTrGGAGAAGAAATTGGTAATC 

TXXjAACATCnTGGGTGGATCCTAAGGGGGCCTGCCGTGTGTAAGAAGATCACA 

GAGCACGAAAAATGCCACATTAATATACKKlATAAACrAACCGCATTrrrOGGG 

ATC^TGCCAAGGGGGACTACACXX^GAGC^XXXKjTGAGGTTCCXTACGAGCTT 

ACTAAAAGTGAGGAGGGGTCTGGAGACnt3GCTGGGCTTACACACACCAAGGC 

GGGATAAGTTCAGTOGACCATOTAAOOGCCGGAAAAGATCTACTGGTCrGTGA 

CAG^WTGGGACGAACTAGAGTGGTTIGCCAAAGCAACAACAGGTTGACCGATG 

AGACAGAGTATGGCGTCAAGACTGACrCAGGGTGGCCAGACGGTGCCAGATG 

TTATGTGTTAAATOCAGAGGOCX7ITAACATATCAGGATCCAAAGGGGCAGTOGT 

TCA CCIXX IAAAAGACAGGTGGAGAATTCACGTGTGTCACCGCATCAGGCACAC 

CGGCllUL'lUlXSAOCTAAAAAACITGAAAGGATGGTCAGGCTroOCTATATTTG T 

AAGCCTCCAGCGGGAGGGTGGTTGGCAGAGTCAAAGTAGGGAAGAATGAAGA

GTCTAAACCTACAAAAATAATGAGTCGAATCCAGAOOGTrcrCAAAAAACACAGC

AGACCTOAC CGAG ATGGTCAAGAAGATAACCAGCATOAACAGGGGAGACTTCA

AGCAGATTACnTGGCAACAGGGGCAGGCAAAACCACAGAACTOCCAAAAGCA

GTTATAGAGGAGATAGGAAGACACAAGAGAGTATTAGTTCTTATAGCATTAAGG

GCAG OGGCAGAGTCAGTCTACCAGTATATGAGATTGAAACACCCAAGCATCTC

TTTTAACCTAAGGATAGGGGACATGAAAGAGGGGGACATGGCAACCGGGATA 

ACCTATGCATCATACGGGTACTTCTGGCAAATGCCTCAACCAAAGCTCAGAGCr 

GCTATGGTAGAATACTCATACATATTUTTAGATGAATACX^TTG^ 

AACAACIXjGCAATTATOGGGAAGATCCAC^GATTTTG\GAGAGTATAAGGGTT 

GTCGCCATGACIX3CXZACGKrCAGCAGGGTCXjGTGACCACAACAGGTCAAAAGC 

ACCXIAATAGAGGAATTCATAGCCCCOGAGGTAATGAAAGGGGAGGATCTTGGT 

AGTCAGTTCXriTGATATAGCAGGGTTAAAAATACCAGTGGATGAGATGAAAGG 

CAATATGTTGGTTTTTGTACCAAGGAGAAAC^TGGCAGTAGAGGTAGCAAAGA 

AGCTAAAAGCTAAGGGCTATAACIXTGGATACTATrACAGTGGAGAGGATOCA 

GCCTVATCIGAGAGTTGTGAC^TCACAATCXXCCrATGTAATCGTGGCT 

GCTATTGAATCAGGAGTGACACTACCAGATTTGGACACGGTTATAGACACGGG 

GTTGAAATGTGAAAAGAGGGTGAGGGTATCATCAAAGATACCCTTCATGGTAA 

CAGGCCTTAAGAGGATGGCCGTGACTGTGGGTGAGCAGGCGCAGCGTAGGGG 

CAGAGTAGGTAGAGTGAAACCXX3GGAGGTATTATAGGAGCCAGGAAACAGCA 

ACAGGGTCAAAGGACTACCACTATGACXnxrTrGCAGGCACAAAGATACGGGAT 

TGAGGATX3GAATCAACGTGACGAAATCXJITTAGGGAGATGAATTACGATTGGA 

GCCTATACXjAGGAGGACAGCGTACrAATAACCCAGCTGGAAATACTAAATAATC 

TACTCATCTCAGAAGACTTGCCAGCCGCTGTTAAGAACATAATGGCCAGGACTG 

ATCACCCAGAGOCAATCCAACTTGCATACAACAGCTATGAAGTCCAGGTCCCG 

GTCCTGTTCCCAAAAATAAGGAATGGAGAAGTCACAGACACCTACGAAAATTAC 

TCGTTTCTAAATGCCAGAAAGTTAGGGGAGGATGTGCCCGTGTATATCTACGCT 
ACTGAAGATGAGGATCTGGCAGTTGACCTCTTAGGGCTAGACTGGCCTGATCC

TGGGAACCAGCAGGTAGTGGAGACTGGTAAAGCACTGAAGCAAGTGACCGGG 

TTGTCCTCGGCTGAAAATGCCCTACTAGTGGCTTTATTTGGGTATGTGGGTTAC 

CAGGCTCTCTCAAAGAGGCATGTCCCAATGATAACAGACATATATACCATCGAG 

GACCAGAGACTAGAAGACACCACCCACCTCCAGTATGCACCCAACGCCATAAA 

AACCGATGGGACAGAGACTGAACTGAAAGAACTGGCGTCGGGTGACGTGGAA 

AAAATCATGG G AGCC ATTTCAG ATT ATGC AGCTGGGGG ACTGG AG" 1 ' 1 ' 1 G 11 AA 

ATCCC AAGCAG AAAAG AT AAAAACAGCTCU 1' I'l G 1 1 ' 1 AAAG AAAACGCAG AAGC 

CXK^AAAGGGTATGTCCAAAAATTCATTGACTCATTAATTGA^ 

AATAATCAGATATGGTTTGTGGGGAACACACACAGCACTATACAAAAGCATAGC 

TGCAAGACTG<XXjCATGAAACAGCOTTTGCCACACrAGTGTTAAAGTGGCrAG 

C l'l ' 1' 1 GG AGGGG AATGAGTGTCAGACXIACXjrC AAGC AGGCGGC AGTTG ATTT A 

GTGGTCTATTATGTGATGAATAAGCCTTCCTrcCCAGGTGACTCCGAGACACAG 

CAAGAAGGGAGGCGATTOGTOGCAAGCCTGTTCATCTCCGCACTGGCAACCTA 

CACATACAAAACTIGGAATTACCACAATCTCTCTAAAGTGGTGGAACCAGCCCT 

GGCrTACCTCCOCTATGCrAOCAGCGCATTAAAAATGTTCACCCCAACGCGGCT 

GGAGAGCGTGGTGATACTGAGCACCACGATATATAAAACATACCTCrCTATAAG 

GAAGGGGAAGAGTGATGGATTGCTGGGTACGGGGATAAGTGCAGCCATGGAA 

ATCCTGTCACAAAACXZCAGTATCGGTAGGTATATCTGTGATGTTGGGGGTAGG 

GGCAATCGCTGCGCACAACGCTATTGAGTCCAGTGAACAGAAAAGGACCCTAC 

TTATG AAGGTGTTTGT AAAGAAC 1'IU 1' 1 GGATCAGGCTGCAACAGATG AGCTGG 

TAAAAGAAAACCCAGAAAAAATTATAATGGCCTTATTTGAAGCAGTCCAGACAA 

TTGGTAACCCCCTGAGACTAATATACCACCTGTATGGGGTTTACTACAAAGGTT 

GGGAGGCCAAGGAACTATCTGAGAGGACAGCAGGCAGAAACTTATTCACATTG 

ATAATGTTTGAAGCCTTOGAGTTATTAGGGATGGACTCACAAGGGAAAATAAG 

GAACCTGTCCGGAAATTACATITnXjATTTGATATACGGCCT

CAACAGAGGGCTGAAGAAAATGGTACTGGGGTGGGCCCCTGCACCCTTTAGTT

GTOACTGGACCCCTAGTGACGAGAGGATCAGATTGCCAACAGACAACTATTTG

AGGGTAGAAACX^GGTGCCCATGTGGCTATGAGATGAAAGCTTTCAAAAATGT

AGGTGGCAAACTTACCAAAGTGGAGGAGAGCGGGCCTTTCCTATGTAGAAACA

GACXTTGGTAGGGGACCAGTCAACTACAGAGTGACCAAGTATTACGATGACAAC

CTCAGAGAGATAAAAOCAGTAGCAAAGTTGGAAGGACAGGTAGAGCACTACTA 

CAAAGGGGTCACAGCAAAAATTGACTACAGTAAAGGAAAAATGCTCTTGGCCA 

CTGACAAGTGGGAGGTGGAACATGGTGTCATAACCAGGTTAGCTAAGAGATAT 

ACTGGGGTCGGGTTCAATGGTGCATACTTAGGTGACGAGCGCAATCACCGTGC 

TCTAGTGGAGAGGGACTGTGCAACTATAACCAAAAACACAGTACAGTTTGTAAA 

AATGAAGAAGGGGTGTGCGTTCACCTATGACCTGACCATCTCCAATCTGACCA 

GGCTCATCGAACTAGTACACAGGAACAATCTTGAAGAGAAGGAAATACCCACC 

GCTACGGTGACCAC^TGGCTAGCTTACACX7ITCGTGAATGAAGACGTAGGGAC 

TATAAAACCAGTACTAGGAGAGAGAGTAATCCCCGACCCTGTAGTTGATATCAA 

TTTACAACX1AGAGGTGC\AGTGGACACGTCAGAGGTTGGGATCACAATAATTC 

GAAGGGAAACXXTGATGACAACGGGAGTGACACCTGTCTTGGAAAAAGTAGA 

GCCTGACGCCAGCGACAACCAAAACTCGGTGAAGATCGGGTTGGATGAGGGT 

AATTACCCAGGGCCTGGAATACAGACACATACACTAACAGAAGAAATACACAA 

CAGGGATGCGAGGCCCTTCATCATGATCCTGGGCTCAAGGAATTCCATATCAA 

ATAGGGCAAAGACTGCTAGAAATATAAATCTGTACACAGGAAATGACCCCAGG 

GAAATACGAGACTTGATGGCTGCAGGGCGCATGTTAGTAGTAGCACTGAGGGA 

TGTCGACCXZTGAGCTGTCTGAAATGGTCGATTTCAAGGGGACrrrrri AGATAG 

GGAGGCCCTGGAGGCTCTAAGTCTCGGGCAACCTAAACCGAAGCAGGTTACCA 

AGGAAGCTGTTAGGAATTTGATAGAACAGAAAAAAGATGTGGAGATCCCTAAC 

TGGTTTGCATCAGATGACCCAGTATTTCTGGAAGTGCjCCTTAAAAAATGATAAG 

TACTACTTAGTAGGAGATGTTGGAGAGGTAAAAGATCAAGCTAAAGCACTTGG 

GGCCACGGATCAGACAAGAATTATAAAGGAGGTAGGCTCAAGGACGTATGCCA 

TGAAGCTATCTAGCTGGTTCCTCAAGGCATCAAACAAACAGATGAGTTTAACTC 
CACTGTTTGAGGAATTGTTGCTACGGTGCCCACCTGCAACTAAG

GGGCACATGGCATCAGCTTACCAATTGGCACAGGGTAACTGGGAGCGCCTCGG 

TTGCGGGGTGCACXTAGGTACAATACCAGCCAGAAGGGTrGAAGATACAOCCAT 

ATGAAGCITACCTGAAGTrGAAAGATTTCATAGAAGAAOAAGAGAAGAAACXT 

AGGGTTAAGGATACAGTAATAAGAGAGCAGAACAAATGGATACTTAAAAAAAT 

AAGGTTTCAAGGAAACCTG\AC^CCAAGAAAATGCTCAA(XCTGGGAAACTATC 

TGAAC^GTTGGACAGGGAGGGGCGCAAGAGGAACATCTACAACCACCAGATT 

GGTACTATAATGTCAAGTGCAGGCATAAGGCTGGAGAAATTGCCAATAGTGAG 

GGCCCAAACCGACACCAAAACCTTTCATGAGGCAATAAGAGATAAGATAGACA 

AGAGTGAAAACXTGGCAAAATCXIAGAATTGCACAACAAATTGTTGGAGATnTCC 

ACACGATAGCCCAACCCACCCTGAAACACACCTACGGTGAGGTGACGTGGGAG 

CAACTTOAGGCGGGGATAAATAGAAAGGGGGCAGCAGGCrrCCTGGAGAAGA 

AGAACATCGGAGAAGTATTGGATTCAGAAAAGCACGTGGTAGAACAATTGGTC 

AGGGATCTGAAGGCCGGGAGAAAGATAAAATATTATGAAACTGCAATACCAAA 

AAATGAGAAGAGAGATGTCAGTGATGACTGGCAGGCAGGGGAOCTGGTGGTT 

GAGAAGAGGCCAAGAGTTATCCAATACXXrTGAAGCCAAGACAAGGCTAGCCAT 

CACTAAGGTCATGTATAACTGGGTGAAACAGCAGCX3CGTTGTGATTCCAGGAT 

ATGAAGGAAAGACXXXXnTGTTCAACATCnTGATAAAGTGAGAAAGGAATGG 

GACTCGTTCAATGAGCXIAGTGGCXXnAAGTTTTGACACCAAAGCC^ 

TCAAGTGACTAGTAAGGATCroCAACTTATTGGAGAAATCX^GAAATATTACTA 

TAAGAAGGAGTGGCACAAGTTCATTGAC^GCATCAGCGACCACATGACAGAAG 

TACCAGTTATAACAGCAGATGGTGAAGTATATATAAGAAATGGGCAGAGAGGG

AGCGGCGA.GCCAGACACAAGTGCTGGCAACAGCATGTTAAATGTCXZTGACAAT 

GATGTACXXXnTCTXKXSAAAGCACAGGGGTACCGTAC^AGAGTTTCAAC^GGG 

TGGCAAGGATCGACGTCIXjTGGGGATGATGGCITCTTAATAACTGAAAAAG^ 

TT AGGGCIGAAATTTGCTAAGAAAGGGATGCAGATTCTTCATGAAGCAGGCAA 

ACXZTCAGAAGATAACXXJAAGGGGAAAAGATGAAAGTTGCCrATAGATTTOAGG 

ATATAGAGTTCIXJITCTCATACCCXIAGTCCXIITGTTAGGTGGTCCGACA^ 

GTAGTCACATGGCCGGGAGAGACACCGCTGTGATACTATCAAAGATGGCAACA 

AGATTGGATTC^AGTGGAGAGAGGGGTACX^CAGCATATGAAAAAGCXXn'AG 

CCTTCAG 11 ' 1 C I ' 1 GCTG ATGT ATTCCTGG AACCCGCTTGTT AGG AGG ATTTGCCT 

GTTGGTCCTTTCGCAACAGCCAGAGACAGACCCATCAAAACATGCCACTTATTA 

TTACAAAGGTGATCCAATAGGGGCCTATAAAGATGTAATAGGTCGGAATCTAA 

GTGAACTGAAGAGAACAGGCTTTGAGAAATTGGCAAATCTAAACCTAAGCCTG 

TOCACGTTGGGGATCTGGACTAAGCACACAAGCAAAAGAATAATTCAGGACTG 

TGTTGCCATTGGGAAAGAAGAGGGCAACTGGCTAGTTAACGCCGACAGGCTGA 

TATGCAGCAAAACrGGGCACTTATACATACCTOATAAAGGCTITACATTACAAG 

GAAAGCATTATGAGCAACTGCAGCTAAGAACAGAGAC^AACCXXKjKATGGGG 

GTTGGGACTGAGAGATACAAGTTAGGTCX^CATAGTCAATCTGCTGCTGAGAAG 

GTTGAAAATTCTGCTCATGAOGGCCGTCGOCGTCAGGAGCTGAgiw'Bnnatt^nfatanp 

aaataaattaafccatgtaratagtgtatataaalalagttgggaccj^^ 

tagggaagacctctaacagcccoc 